Skip to content

Optimize `standardRoles.membersCount` query

What does this MR do and why?

The Roles and Permissions page is failing to load for large group hierarchies due to a poorly performing SQL query. The analysis of the poorly performing query can be found here.

This MR attempts to optimize this query to work efficiently for large group hierarchies with a lot of members. The group id used in the examples below have 5052258 members records.

Before:

SELECT
  "members"."access_level",
  COUNT(*) AS members_count
FROM
  (
    (
      SELECT
        "members"."id",
        "members"."access_level",
        "members"."source_id",
        "members"."source_type",
        "members"."user_id",
        "members"."notification_level",
        "members"."type",
        "members"."created_at",
        "members"."updated_at",
        "members"."created_by_id",
        "members"."invite_email",
        "members"."invite_token",
        "members"."invite_accepted_at",
        "members"."requested_at",
        "members"."expires_at",
        "members"."ldap",
        "members"."override",
        "members"."invite_email_success",
        "members"."state",
        "members"."member_namespace_id",
        "members"."member_role_id",
        "members"."expiry_notified_at",
        "members"."request_accepted_at",
        "members"."last_activity_on"
      FROM
        "members"
      WHERE
        "members"."source_type" = 'Namespace'
        AND "members"."source_id" IN (
          SELECT
            "namespaces"."id"
          FROM
            "namespaces"
          WHERE
            "namespaces"."type" = 'Group'
            AND (traversal_ids @> ('{70960922}'))
        )
    )
    UNION
    (
      SELECT
        "members"."id",
        "members"."access_level",
        "members"."source_id",
        "members"."source_type",
        "members"."user_id",
        "members"."notification_level",
        "members"."type",
        "members"."created_at",
        "members"."updated_at",
        "members"."created_by_id",
        "members"."invite_email",
        "members"."invite_token",
        "members"."invite_accepted_at",
        "members"."requested_at",
        "members"."expires_at",
        "members"."ldap",
        "members"."override",
        "members"."invite_email_success",
        "members"."state",
        "members"."member_namespace_id",
        "members"."member_role_id",
        "members"."expiry_notified_at",
        "members"."request_accepted_at",
        "members"."last_activity_on"
      FROM
        "members"
      WHERE
        "members"."source_type" = 'Project'
        AND "members"."source_id" IN (
          SELECT
            "projects"."id"
          FROM
            "projects"
          WHERE
            "projects"."namespace_id" IN (
              SELECT
                namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
              FROM
                "namespaces"
              WHERE
                "namespaces"."type" = 'Group'
                AND (traversal_ids @> ('{70960922}'))
            )
        )
    )
  ) members
WHERE
  "members"."member_role_id" IS NULL
GROUP BY
  "members"."access_level";
Time: 3.023 min
  - planning: 6.193 ms
  - execution: 3.023 min
    - I/O read: 2.560 min
    - I/O write: 1.673 s

Shared buffers:
  - hits: 986002 (~7.50 GiB) from the buffer pool
  - reads: 1298012 (~9.90 GiB) from the OS file cache, including disk I/O
  - dirtied: 136392 (~1.00 GiB)
  - writes: 65579 (~512.30 MiB)

Temp buffers:
  - reads: 128412 (~1003.20 MiB)
  - writes: 187185 (~1.40 GiB)

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/32671/commands/100775

SELECT "members"."access_level", COUNT(DISTINCT user_id) AS users_count
FROM
(
  (
    SELECT
      "members"."id",
      "members"."access_level",
      "members"."source_id",
      "members"."source_type",
      "members"."user_id",
      "members"."notification_level",
      "members"."type",
      "members"."created_at",
      "members"."updated_at",
      "members"."created_by_id",
      "members"."invite_email",
      "members"."invite_token",
      "members"."invite_accepted_at",
      "members"."requested_at",
      "members"."expires_at",
      "members"."ldap",
      "members"."override",
      "members"."state",
      "members"."invite_email_success",
      "members"."member_namespace_id",
      "members"."member_role_id",
      "members"."expiry_notified_at",
      "members"."request_accepted_at"
    FROM "members"
    WHERE "members"."source_type" = 'Namespace'
      AND "members"."source_id" IN (
        SELECT "namespaces"."id"
        FROM "namespaces"
        WHERE "namespaces"."type" = 'Group'
          AND (traversal_ids @> ('{70960922}'))
      )
  )
  UNION
  (
    SELECT
      "members"."id",
      "members"."access_level",
      "members"."source_id",
      "members"."source_type",
      "members"."user_id",
      "members"."notification_level",
      "members"."type",
      "members"."created_at",
      "members"."updated_at",
      "members"."created_by_id",
      "members"."invite_email",
      "members"."invite_token",
      "members"."invite_accepted_at",
      "members"."requested_at",
      "members"."expires_at",
      "members"."ldap",
      "members"."override",
      "members"."state",
      "members"."invite_email_success",
      "members"."member_namespace_id",
      "members"."member_role_id",
      "members"."expiry_notified_at",
      "members"."request_accepted_at"
    FROM "members"
    WHERE "members"."source_type" = 'Project'
      AND "members"."source_id" IN (
        SELECT "projects"."id"
        FROM "projects"
        WHERE "projects"."namespace_id" IN (
            SELECT namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
            FROM "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND (traversal_ids @> ('{70960922}'))
          )
      )
  )
) members
WHERE "members"."member_role_id" IS NULL
GROUP BY "members"."access_level"
Time: 5.078 min
  - planning: 5.905 ms
  - execution: 5.078 min
    - I/O read: 4.794 min
    - I/O write: 84.811 ms

Shared buffers:
  - hits: 959160 (~7.30 GiB) from the buffer pool
  - reads: 1342687 (~10.20 GiB) from the OS file cache, including disk I/O
  - dirtied: 69667 (~544.30 MiB)
  - writes: 6773 (~52.90 MiB)

Temp buffers:
  - reads: 96183 (~751.40 MiB)
  - writes: 139946 (~1.10 GiB)

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/33289/commands/102710

After:

SELECT COUNT(*) AS "count_all", "members"."access_level" AS "members_access_level"
FROM
(
  (
    SELECT "members"."id", "members"."access_level"
    FROM "members"
    WHERE "members"."member_role_id" IS NULL
      AND "members"."source_id" IN (
        SELECT namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
        FROM "namespaces"
        WHERE "namespaces"."type" = 'Group'
          AND (traversal_ids @> ('{70960922}'))
      )
      AND "members"."source_type" = 'Namespace'
  )
  UNION ALL (
    SELECT "members"."id", "members"."access_level"
    FROM "members"
    WHERE "members"."member_role_id" IS NULL
      AND "members"."source_id" IN (
        SELECT "projects"."id"
        FROM "projects"
        WHERE "projects"."namespace_id" IN (
            SELECT namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
            FROM "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND (traversal_ids @> ('{70960922}'))
          )
      )
      AND "members"."source_type" = 'Project'
  )
) members
GROUP BY "members"."access_level"
Time: 10.023 s
  - planning: 147.680 ms
  - execution: 9.876 s
    - I/O read: 7.594 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 400999 (~3.10 GiB) from the buffer pool
  - reads: 91776 (~717.00 MiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/33194/commands/102530

SELECT
  COUNT(DISTINCT(user_id)) AS "count_distinct_user_id",
  "members"."access_level" AS "members_access_level"
FROM
(
  (
    SELECT
      "members"."id",
      "members"."access_level",
      "members"."user_id"
    FROM "members"
    WHERE "members"."member_role_id" IS NULL
      AND "members"."source_id" IN (
        SELECT namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
        FROM "namespaces"
        WHERE "namespaces"."type" = 'Group'
          AND (traversal_ids @> ('{70960922}'))
      )
      AND "members"."source_type" = 'Namespace'
  )
  UNION ALL (
    SELECT
      "members"."id",
      "members"."access_level",
      "members"."user_id"
    FROM "members"
    WHERE "members"."member_role_id" IS NULL
      AND "members"."source_id" IN (
        SELECT "projects"."id"
        FROM "projects"
        WHERE "projects"."namespace_id" IN (
            SELECT namespaces.traversal_ids [array_length(namespaces.traversal_ids, 1)] AS id
            FROM "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND (traversal_ids @> ('{70960922}'))
          )
      )
      AND "members"."source_type" = 'Project'
  )
) members
GROUP BY "members"."access_level"
Time: 43.498 s
  - planning: 60.124 ms
  - execution: 43.438 s
    - I/O read: 30.997 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 942324 (~7.20 GiB) from the buffer pool
  - reads: 1259129 (~9.60 GiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Temp buffers:
  - reads: 9156 (~71.50 MiB)
  - writes: 9158 (~71.50 MiB)

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/33289/commands/102720

#498688 (closed)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Before After
Time: 3.023 min Time: 10.023 s
Time: 5.078 min Time: 43.498 s

How to set up and validate locally

Edited by mo khan

Merge request reports

Loading