Draft: Replace all_objects call with multiple groups for linear version
What does this MR do and why?
Model ObjectHierarchy
allows you to provide some groups to get the ancestors from and others to get the descendants from. This situation doesn't align very well with the new linear strategy.
In this MR we're changing those references where we pass both params and, instead, get the data and performing a union, getting the same result.
How to set up and validate locally
Feature.enable :use_traversal_ids
Feature.enable :traversal_ids_btree
Feature.enable :use_traversal_ids_for_ancestor_scopes
Feature.enable :use_traversal_ids_for_descendants_scopes
Feature.enable :replace_multiple_groups_all_objects
user = User.find_by(admin: false)
GroupsFinder.new(user).execute.explain # For GroupsFinder
IssuesFinder.new(user, { milestone_title: Milestone::Upcoming.name }).execute # For Issuable::Params
SQL queries
All the times have been retrieved using a very expensive user in order to test the worst scenario.
The former query was:
WITH RECURSIVE "base_and_ancestors" AS (
(SELECT *
FROM (
(WITH "direct_groups" AS MATERIALIZED
(SELECT *
FROM (
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10))
UNION
(SELECT "namespaces".*
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
WHERE "project_authorizations"."user_id" = 1614863)) namespaces
WHERE "namespaces"."type" = 'Group') SELECT *
FROM (
(SELECT "namespaces".*
FROM "direct_groups" "namespaces"
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "group_group_links" ON "group_group_links"."shared_group_id" = "namespaces"."id"
WHERE "namespaces"."type" = 'Group'
AND "group_group_links"."shared_with_group_id" IN
(SELECT "namespaces"."id"
FROM "direct_groups" "namespaces"
WHERE "namespaces"."type" = 'Group'))) namespaces
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."access_level" = 5)) namespaces
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "namespaces",
"base_and_ancestors"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = "base_and_ancestors"."parent_id")),
"base_and_descendants" AS (
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10))
UNION
(SELECT "namespaces".*
FROM "namespaces",
"base_and_descendants"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id"))
SELECT "namespaces".*
FROM (
(SELECT "namespaces".*
FROM "base_and_ancestors" AS "namespaces"
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "base_and_descendants" AS "namespaces"
WHERE "namespaces"."type" = 'Group')) namespaces
WHERE "namespaces"."type" = 'Group'
This is the execution time and the times are:
Time: 552.207 ms
- planning: 10.615 ms
- execution: 541.592 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 289012 (~2.20 GiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
The new query is:
SELECT "namespaces".*
FROM (
(WITH "base_ancestors_cte" AS MATERIALIZED
(SELECT "namespaces"."id", "namespaces"."traversal_ids"
FROM (
(WITH "direct_groups" AS MATERIALIZED
(SELECT "namespaces".*
FROM (
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10))
UNION
(SELECT "namespaces".*
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
WHERE "project_authorizations"."user_id" = 1614863)) namespaces
WHERE "namespaces"."type" = 'Group') SELECT "namespaces".*
FROM (
(SELECT "namespaces".*
FROM "direct_groups" "namespaces"
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "group_group_links" ON "group_group_links"."shared_group_id" = "namespaces"."id"
WHERE "namespaces"."type" = 'Group'
AND "group_group_links"."shared_with_group_id" IN
(SELECT "namespaces"."id"
FROM "direct_groups" "namespaces"
WHERE "namespaces"."type" = 'Group'))) namespaces
WHERE "namespaces"."type" = 'Group')
UNION
(SELECT "namespaces".*
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."access_level" = 5)) namespaces
WHERE "namespaces"."type" = 'Group'),
"ancestors_cte" AS MATERIALIZED
(SELECT id AS base_id, unnest(traversal_ids) AS ancestor_id
FROM "base_ancestors_cte") SELECT DISTINCT "namespaces".*
FROM "ancestors_cte",
"namespaces"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = "ancestors_cte"."ancestor_id")
UNION
(WITH "descendants_base_cte" AS MATERIALIZED
(SELECT "namespaces"."traversal_ids", LEAD (namespaces.traversal_ids, 1) OVER (
ORDER BY namespaces.traversal_ids ASC) next_traversal_ids
FROM "namespaces"
INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
WHERE "members"."type" = 'GroupMember'
AND "members"."source_type" = 'Namespace'
AND "namespaces"."type" = 'Group'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10)),
"descendants_cte" AS MATERIALIZED
(SELECT "namespaces".*
FROM "descendants_base_cte", "namespaces"
WHERE "namespaces"."type" = 'Group'
AND ("descendants_base_cte"."next_traversal_ids" IS NULL
OR "descendants_base_cte"."next_traversal_ids" > "namespaces"."traversal_ids")
AND next_traversal_ids_sibling("descendants_base_cte"."traversal_ids") > "namespaces"."traversal_ids"
AND "descendants_base_cte"."traversal_ids" <= "namespaces"."traversal_ids") SELECT "namespaces".*
FROM "descendants_cte" AS "namespaces")) namespaces
This is the execution time and the times are:
Time: 642.182 ms
- planning: 8.096 ms
- execution: 634.086 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 317059 (~2.40 GiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
The new query seems to be slightly more expensive than the original one. With more lightweight users, the new query is still lower but only by 50-100ms.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #349167 (closed)
Closes #353424 (closed)