Linear traversal version of EE::Namespace#any_project_with_shared_runners_enabled? is slow

The original query using recursive traversal query:

SELECT
  1 AS one
FROM
  "projects"
WHERE
  "projects"."namespace_id" IN ( WITH RECURSIVE "base_and_descendants" AS (
(
        SELECT
          "namespaces".*
        FROM
          "namespaces"
        WHERE
          "namespaces"."type" = 'Group'
          AND "namespaces"."id" = 4249178)
      UNION (
        SELECT
          "namespaces".*
        FROM
          "namespaces",
          "base_and_descendants"
        WHERE
          "namespaces"."type" = 'Group'
          AND "namespaces"."parent_id" = "base_and_descendants"."id"))
      SELECT
        id
      FROM
        "base_and_descendants" AS "namespaces")
    AND "projects"."shared_runners_enabled" = TRUE
  LIMIT 1

Timings and plan from database-lab: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4104/commands/14337

The query produced using linear traversal queries

SELECT
  1 AS one
FROM
  "projects"
WHERE
  "projects"."namespace_id" IN (
    SELECT
      "namespaces"."id"
    FROM
      "namespaces"
    WHERE (traversal_ids @> ('{4249178}')))
  AND "projects"."shared_runners_enabled" = TRUE
LIMIT 1;

Timings and plan from database-lab: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4104/commands/14336

The linear version seems to be performing better than the recursive one. However, we can see this query running for over 10 seconds in production (eg. check correclation_id: 01F61XDEZHKWNPYEFVV7892RHH).

When I run EXPLAIN ANALYZE on the read-only replica, it gives me a different query plan:

                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.87..102.50 rows=1 width=4) (actual time=11363.502..11363.504 rows=0 loops=1)
   ->  Nested Loop  (cost=0.87..975117.81 rows=9595 width=4) (actual time=11363.501..11363.502 rows=0 loops=1)
         ->  Index Scan using namespaces_pkey on namespaces  (cost=0.43..808139.83 rows=5755 width=4) (actual time=2248.267..11292.270 rows=5524 loops=1)
               Filter: (traversal_ids @> '{4249178}'::integer[])
               Rows Removed by Filter: 10921772
         ->  Index Scan using index_projects_on_namespace_id_and_id on projects  (cost=0.44..28.82 rows=19 width=4) (actual time=0.012..0.012 rows=0 loops=5524)
               Index Cond: (namespace_id = namespaces.id)
               Filter: shared_runners_enabled
               Rows Removed by Filter: 6
 Planning Time: 1.478 ms
 Execution Time: 11363.576 ms

The difference is that it uses namespaces_pkey index instead of index_namespaces_on_traversal_ids.

Edited May 20, 2021 by Imre Farkas