Skip to content

Use CTE for looking up projects with shared runners enabled

What does this MR do and why?

Related to: gitlab-com/gl-infra/production#5622 (closed)

This MR fixes the query plan flip for a large number of namespaces in the hierarchy. The change is behind a FF: use_cte_for_any_project_with_shared_runners_enabled

The current query times out:

SELECT
  1 AS one
FROM
  "projects"
WHERE
  "projects"."namespace_id" IN (
    SELECT
      traversal_ids[array_length(traversal_ids, 1)] AS id
    FROM
      "namespaces"
    WHERE (traversal_ids @> ('{4909902}'))
    )
  AND "projects"."shared_runners_enabled" = TRUE
LIMIT 1;

Plan (times out):

 Limit  (cost=227.18..18361.05 rows=1 width=4)
   ->  Nested Loop Semi Join  (cost=227.18..3335580541.73 rows=183942 width=4)
         Join Filter: (projects.namespace_id = namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)])
         ->  Index Scan using index_projects_on_namespace_id_and_id on projects  (cost=0.44..1796263.15 rows=20021115 width=4)
               Filter: shared_runners_enabled
         ->  Materialize  (cost=226.74..18390.93 rows=9515 width=25)
               ->  Bitmap Heap Scan on namespaces  (cost=226.74..18343.36 rows=9515 width=25)
                     Recheck Cond: (traversal_ids @> '{4909902}'::integer[])
                     ->  Bitmap Index Scan on index_namespaces_on_traversal_ids  (cost=0.00..224.36 rows=9515 width=0)
                           Index Cond: (traversal_ids @> '{4909902}'::integer[])
(10 rows)

The new query uses a CTE as optimization fence:

WITH "namespace_self_and_descendants_cte" AS MATERIALIZED
  (SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
   FROM "namespaces"
   WHERE (traversal_ids @> ('{4909902}')))
SELECT 1 AS one
FROM projects,
     namespace_self_and_descendants_cte
WHERE (projects.namespace_id = namespace_self_and_descendants_cte.id)
  AND "projects"."shared_runners_enabled" = TRUE
LIMIT 1

Plan:

Limit  (cost=18367.58..18369.02 rows=1 width=4) (actual time=6.854..6.856 rows=1 loops=1)
   CTE namespace_self_and_descendants_cte
     ->  Bitmap Heap Scan on namespaces  (cost=226.74..18367.14 rows=9515 width=4) (actual time=6.247..6.247 rows=1 loops=1)
           Recheck Cond: (traversal_ids @> '{4909902}'::integer[])
           Heap Blocks: exact=1
           ->  Bitmap Index Scan on index_namespaces_on_traversal_ids  (cost=0.00..224.36 rows=9515 width=0) (actual time=4.797..4.798 rows=10236 loops=1)
                 Index Cond: (traversal_ids @> '{4909902}'::integer[])
   ->  Nested Loop  (cost=0.44..265189.82 rows=183943 width=4) (actual time=6.852..6.853 rows=1 loops=1)
         ->  CTE Scan on namespace_self_and_descendants_cte  (cost=0.00..190.30 rows=9515 width=4) (actual time=6.250..6.250 rows=1 loops=1)
         ->  Index Scan using index_projects_on_namespace_id_and_id on projects  (cost=0.44..27.66 rows=19 width=4) (actual time=0.599..0.599 rows=1 loops=1)
               Index Cond: (namespace_id = namespace_self_and_descendants_cte.id)
               Filter: shared_runners_enabled
 Planning Time: 0.526 ms
 Execution Time: 7.014 ms
(14 rows)

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Adam Hegyi

Merge request reports