Skip to content

Draft: Replace all_objects call with multiple groups for linear version

What does this MR do and why?

Model ObjectHierarchy allows you to provide some groups to get the ancestors from and others to get the descendants from. This situation doesn't align very well with the new linear strategy.

In this MR we're changing those references where we pass both params and, instead, get the data and performing a union, getting the same result.

How to set up and validate locally

Feature.enable :use_traversal_ids
Feature.enable :traversal_ids_btree
Feature.enable :use_traversal_ids_for_ancestor_scopes
Feature.enable :use_traversal_ids_for_descendants_scopes
Feature.enable :replace_multiple_groups_all_objects

user = User.find_by(admin: false)
GroupsFinder.new(user).execute.explain # For GroupsFinder
IssuesFinder.new(user, { milestone_title: Milestone::Upcoming.name }).execute # For Issuable::Params

SQL queries

All the times have been retrieved using a very expensive user in order to test the worst scenario.

The former query was:

WITH RECURSIVE "base_and_ancestors" AS (
                                          (SELECT *
                                           FROM (
                                                   (WITH "direct_groups" AS MATERIALIZED
                                                      (SELECT *
                                                       FROM (
                                                               (SELECT "namespaces".*
                                                                FROM "namespaces"
                                                                INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                                                                WHERE "members"."type" = 'GroupMember'
                                                                  AND "members"."source_type" = 'Namespace'
                                                                  AND "namespaces"."type" = 'Group'
                                                                  AND "members"."user_id" = 1614863
                                                                  AND "members"."requested_at" IS NULL
                                                                  AND (access_level >= 10))
                                                             UNION
                                                               (SELECT "namespaces".*
                                                                FROM "projects"
                                                                INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
                                                                INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
                                                                WHERE "project_authorizations"."user_id" = 1614863)) namespaces
                                                       WHERE "namespaces"."type" = 'Group') SELECT *
                                                    FROM (
                                                            (SELECT "namespaces".*
                                                             FROM "direct_groups" "namespaces"
                                                             WHERE "namespaces"."type" = 'Group')
                                                          UNION
                                                            (SELECT "namespaces".*
                                                             FROM "namespaces"
                                                             INNER JOIN "group_group_links" ON "group_group_links"."shared_group_id" = "namespaces"."id"
                                                             WHERE "namespaces"."type" = 'Group'
                                                               AND "group_group_links"."shared_with_group_id" IN
                                                                 (SELECT "namespaces"."id"
                                                                  FROM "direct_groups" "namespaces"
                                                                  WHERE "namespaces"."type" = 'Group'))) namespaces
                                                    WHERE "namespaces"."type" = 'Group')
                                                 UNION
                                                   (SELECT "namespaces".*
                                                    FROM "namespaces"
                                                    INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                                                    WHERE "members"."type" = 'GroupMember'
                                                      AND "members"."source_type" = 'Namespace'
                                                      AND "namespaces"."type" = 'Group'
                                                      AND "members"."user_id" = 1614863
                                                      AND "members"."access_level" = 5)) namespaces
                                           WHERE "namespaces"."type" = 'Group')
                                        UNION
                                          (SELECT "namespaces".*
                                           FROM "namespaces",
                                                "base_and_ancestors"
                                           WHERE "namespaces"."type" = 'Group'
                                             AND "namespaces"."id" = "base_and_ancestors"."parent_id")),
               "base_and_descendants" AS (
                                            (SELECT "namespaces".*
                                             FROM "namespaces"
                                             INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                                             WHERE "members"."type" = 'GroupMember'
                                               AND "members"."source_type" = 'Namespace'
                                               AND "namespaces"."type" = 'Group'
                                               AND "members"."user_id" = 1614863
                                               AND "members"."requested_at" IS NULL
                                               AND (access_level >= 10))
                                          UNION
                                            (SELECT "namespaces".*
                                             FROM "namespaces",
                                                  "base_and_descendants"
                                             WHERE "namespaces"."type" = 'Group'
                                               AND "namespaces"."parent_id" = "base_and_descendants"."id"))
SELECT "namespaces".*
FROM (
        (SELECT "namespaces".*
         FROM "base_and_ancestors" AS "namespaces"
         WHERE "namespaces"."type" = 'Group')
      UNION
        (SELECT "namespaces".*
         FROM "base_and_descendants" AS "namespaces"
         WHERE "namespaces"."type" = 'Group')) namespaces
WHERE "namespaces"."type" = 'Group'

This is the execution time and the times are:

Time: 552.207 ms
  - planning: 10.615 ms
  - execution: 541.592 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 289012 (~2.20 GiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

The new query is:

SELECT "namespaces".*
FROM (
        (WITH "base_ancestors_cte" AS MATERIALIZED
           (SELECT "namespaces"."id", "namespaces"."traversal_ids"
            FROM (
                    (WITH "direct_groups" AS MATERIALIZED
                       (SELECT "namespaces".*
                        FROM (
                                (SELECT "namespaces".*
                                 FROM "namespaces"
                                 INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                                 WHERE "members"."type" = 'GroupMember'
                                   AND "members"."source_type" = 'Namespace'
                                   AND "namespaces"."type" = 'Group'
                                   AND "members"."user_id" = 1614863
                                   AND "members"."requested_at" IS NULL
                                   AND (access_level >= 10))
                              UNION
                                (SELECT "namespaces".*
                                 FROM "projects"
                                 INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
                                 INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
                                 WHERE "project_authorizations"."user_id" = 1614863)) namespaces
                        WHERE "namespaces"."type" = 'Group') SELECT "namespaces".*
                     FROM (
                             (SELECT "namespaces".*
                              FROM "direct_groups" "namespaces"
                              WHERE "namespaces"."type" = 'Group')
                           UNION
                             (SELECT "namespaces".*
                              FROM "namespaces"
                              INNER JOIN "group_group_links" ON "group_group_links"."shared_group_id" = "namespaces"."id"
                              WHERE "namespaces"."type" = 'Group'
                                AND "group_group_links"."shared_with_group_id" IN
                                  (SELECT "namespaces"."id"
                                   FROM "direct_groups" "namespaces"
                                   WHERE "namespaces"."type" = 'Group'))) namespaces
                     WHERE "namespaces"."type" = 'Group')
                  UNION
                    (SELECT "namespaces".*
                     FROM "namespaces"
                     INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                     WHERE "members"."type" = 'GroupMember'
                       AND "members"."source_type" = 'Namespace'
                       AND "namespaces"."type" = 'Group'
                       AND "members"."user_id" = 1614863
                       AND "members"."access_level" = 5)) namespaces
            WHERE "namespaces"."type" = 'Group'),
              "ancestors_cte" AS MATERIALIZED
           (SELECT id AS base_id, unnest(traversal_ids) AS ancestor_id
            FROM "base_ancestors_cte") SELECT DISTINCT "namespaces".*
         FROM "ancestors_cte",
              "namespaces"
         WHERE "namespaces"."type" = 'Group'
           AND "namespaces"."id" = "ancestors_cte"."ancestor_id")
      UNION
        (WITH "descendants_base_cte" AS MATERIALIZED
           (SELECT "namespaces"."traversal_ids", LEAD (namespaces.traversal_ids, 1) OVER (
                                                                                          ORDER BY namespaces.traversal_ids ASC) next_traversal_ids
            FROM "namespaces"
            INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
            WHERE "members"."type" = 'GroupMember'
              AND "members"."source_type" = 'Namespace'
              AND "namespaces"."type" = 'Group'
              AND "members"."user_id" = 1614863
              AND "members"."requested_at" IS NULL
              AND (access_level >= 10)),
              "descendants_cte" AS MATERIALIZED
           (SELECT "namespaces".*
            FROM "descendants_base_cte", "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND ("descendants_base_cte"."next_traversal_ids" IS NULL
                   OR "descendants_base_cte"."next_traversal_ids" > "namespaces"."traversal_ids")
              AND next_traversal_ids_sibling("descendants_base_cte"."traversal_ids") > "namespaces"."traversal_ids"
              AND "descendants_base_cte"."traversal_ids" <= "namespaces"."traversal_ids") SELECT "namespaces".*
         FROM "descendants_cte" AS "namespaces")) namespaces

This is the execution time and the times are:

Time: 642.182 ms
  - planning: 8.096 ms
  - execution: 634.086 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 317059 (~2.40 GiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

The new query seems to be slightly more expensive than the original one. With more lightweight users, the new query is still lower but only by 50-100ms.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #349167 (closed)

Closes #353424 (closed)

Edited by Alex Pooley

Merge request reports