Skip to content

WIP Refactor method in EE::Group

Etienne Baqué requested to merge ebaque-refactor-ee-group into master

What does this MR do?

This MR refactors some code within the EE::Group class.

Follow up from !48368 (merged)

DB: Explain plan

Queries for the billed_user_ids method (with group being gitlab-org)

Group.find(9970).billed_user_ids

master version

(4 queries)
SELECT DISTINCT "members"."user_id" FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces")

SELECT DISTINCT "members"."user_id" FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'ProjectMember' AND "members"."source_type" = 'Project' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND ("users"."user_type" IS NULL OR "users"."user_type" != 6) AND "members"."source_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN "namespaces" ON "namespaces"."type" = 'Group' AND "namespaces"."id" = "projects"."namespace_id" AND "namespaces"."type" = 'Group' WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces"))

SELECT DISTINCT "members"."user_id" FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "group_group_links" ON "group_group_links"."shared_with_group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "group_group_links"."shared_group_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces"))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces")

SELECT DISTINCT "members"."user_id" FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "project_group_links" ON "project_group_links"."group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "project_group_links"."project_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project' WHERE (rs.path LIKE 'gitlab-org/%')))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces")

4 explain plans:

Time: 23.493 ms
  - planning: 1.964 ms
  - execution: 21.529 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 12875 (~100.60 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

---

Time: 23.299 ms
  - planning: 3.383 ms
  - execution: 19.916 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 13680 (~106.90 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

---

Time: 42.739 ms
  - planning: 3.502 ms
  - execution: 39.237 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 15165 (~118.50 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

---

Time: 46.519 ms
  - planning: 4.356 ms
  - execution: 42.163 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 24582 (~192.00 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

queries on master, with explain.depesz.com

Plans for these 4 queries, this time using https://explain.depesz.com:

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))


                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))


                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))


                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))

With ::Member.from_union

For Group.find(9970).billed_user_ids:

EXPLAIN SELECT DISTINCT "members"."user_id" FROM ((SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces"))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'ProjectMember' AND "members"."source_type" = 'Project' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND ("users"."user_type" IS NULL OR "users"."user_type" != 6) AND "members"."source_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN "namespaces" ON "namespaces"."type" = 'Group' AND "namespaces"."id" = "projects"."namespace_id" AND "namespaces"."type" = 'Group' WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces")))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "group_group_links" ON "group_group_links"."shared_with_group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "group_group_links"."shared_group_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces"))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces"))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "project_group_links" ON "project_group_links"."group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "project_group_links"."project_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project' WHERE (rs.path LIKE 'gitlab-org/%')))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces"))) members
Time: 136.281 ms
  - planning: 11.420 ms
  - execution: 124.861 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 66302 (~518.00 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Recommendations:

  • Query processes too much data to return a relatively small number of rows.
  • Specialized index needed
  • VACUUM ANALYZE needed

For Group.find(9970).billed_user_ids('gold'):

SELECT DISTINCT "members"."user_id" FROM ((SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces") AND (members.access_level > 10))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'ProjectMember' AND "members"."source_type" = 'Project' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND ("users"."user_type" IS NULL OR "users"."user_type" != 6) AND "members"."source_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN "namespaces" ON "namespaces"."type" = 'Group' AND "namespaces"."id" = "projects"."namespace_id" AND "namespaces"."type" = 'Group' WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces")) AND (members.access_level > 10))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "group_group_links" ON "group_group_links"."shared_with_group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "group_group_links"."shared_group_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id" FROM "base_and_descendants" AS "namespaces") AND (group_access > 10))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces") AND (members.access_level > 10))
UNION
(SELECT "members".* FROM "members" LEFT OUTER JOIN "users" ON "members"."user_id" = "users"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "users"."state" = 'active' AND "members"."requested_at" IS NULL AND "members"."invite_token" IS NULL AND (members.access_level > 5) AND "members"."source_id" IN (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" INNER JOIN "project_group_links" ON "project_group_links"."group_id" = "namespaces"."id" WHERE "namespaces"."type" = 'Group' AND "project_group_links"."project_id" IN (SELECT "projects"."id" FROM "projects" INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project' WHERE (rs.path LIKE 'gitlab-org/%')) AND (group_access > 10))
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces"."id" FROM "base_and_ancestors" AS "namespaces") AND (members.access_level > 10))) members
Time: 142.349 ms
  - planning: 10.592 ms
  - execution: 131.757 ms
    - I/O read: 6.341 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 66134 (~516.70 MiB) from the buffer pool
  - reads: 7 (~56.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Recommendations:

  • Query processes too much data to return a relatively small number of rows
  • Specialized index needed
  • VACUUM ANALYZE needed

With ::Member.from_union, with explain.depesz.com

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Etienne Baqué

Merge request reports