Skip to content

Optimize User#ci_owned_runners query V3

Furkan Ayhan requested to merge 336436-v3-optimize-ci_owned_runners-query into master

What does this MR do and why?

This is a new version of !79729 (merged). It was reverted by !80225 (merged) because of a staging QA incident: gitlab-com/gl-infra/production#6315 (closed). The problem is that the user gitlab-qa / 1614863 has a very extraordinary number of groups and projects (~6-7k) while the staging environment does not have many other than those. So, the SQL query plan was not good.

We also recently tried !81248 (closed) but it was not good in production.

Then, we decided to clean up staging env, so we did: 1, 2, 3.

Then, @ayufan created !83521 (merged) to reduce the number of ids that we pass to the query.

This MR is based on that and it optimizes the query from User#ci_owned_runners by moving subqueries into CTEs and letting them use the existing index with a smarter query plan.

  • Original issue: #336436
  • Feature flag: #350322 (closed) ci_owned_runners_cross_joins_fix

Database

Required data:
SELECT "members"."source_id"
FROM "members"
WHERE "members"."source_type" = 'Project'
  AND "members"."type" = 'ProjectMember'
  AND "members"."user_id" = 1614863
  AND "members"."requested_at" IS NULL
  AND (access_level >= 40);
SELECT "namespaces"."traversal_ids"
FROM "namespaces"
INNER JOIN "members" ON "members"."source_type" = 'Namespace'
                    AND "members"."requested_at" IS NULL
                    AND "members"."source_id" = "namespaces"."id"
                    AND "members"."type" = 'GroupMember'
WHERE "namespaces"."type" = 'Group'
  AND "members"."source_type" = 'Namespace'
  AND "members"."type" = 'GroupMember'
  AND "members"."user_id" = 1614863
  AND "members"."requested_at" IS NULL
  AND (access_level >= 10)
  AND (access_level >= 40)
ORDER BY namespaces.traversal_ids;

-- ids = [result above]
-- last_prefix = [-1]
-- namespace_ids = []

-- ids.each do |prefix|
--   next if last_prefix == prefix[0..(last_prefix.count - 1)]

--   last_prefix = prefix
--   namespace_ids << prefix[-1]
-- end
SELECT "namespaces"."traversal_ids"
FROM "namespaces"
INNER JOIN "members" ON "members"."source_type" = 'Namespace'
                    AND "members"."requested_at" IS NULL
                    AND "members"."source_id" = "namespaces"."id"
                    AND "members"."type" = 'GroupMember'
WHERE "namespaces"."type" = 'Group'
  AND "members"."source_type" = 'Namespace'
  AND "members"."type" = 'GroupMember'
  AND "members"."user_id" = 1614863
  AND "members"."requested_at" IS NULL
  AND (access_level >= 10)
  AND (access_level >= 50)
ORDER BY namespaces.traversal_ids;

-- ids = [result above]
-- last_prefix = [-1]
-- namespace_ids = []

-- ids.each do |prefix|
--   next if last_prefix == prefix[0..(last_prefix.count - 1)]

--   last_prefix = prefix
--   namespace_ids << prefix[-1]
-- end

New query for ci_owned_runners

SELECT ci_runners.*
FROM (
  (
    SELECT ci_runners.*
    FROM ci_runners
    INNER JOIN ci_runner_projects runner_projects ON runner_projects.runner_id = ci_runners.id
    WHERE runner_projects.project_id IN (...ids...)
  )
  UNION
  (
    WITH cte_namespace_ids AS MATERIALIZED (
      SELECT ci_namespace_mirrors.namespace_id
      FROM ci_namespace_mirrors
      WHERE (traversal_ids && ARRAY[...ids...]::int[])
    ),
    cte_project_ids AS MATERIALIZED (
      SELECT ci_project_mirrors.project_id
      FROM ci_project_mirrors
      WHERE (ci_project_mirrors.namespace_id IN (SELECT namespace_id FROM cte_namespace_ids))
    )
    SELECT ci_runners.*
    FROM ci_runners
    INNER JOIN ci_runner_projects ON ci_runner_projects.runner_id = ci_runners.id
    WHERE (ci_runner_projects.project_id IN (SELECT project_id FROM cte_project_ids))
  )
  UNION
  (
    WITH cte_namespace_ids AS MATERIALIZED (
      SELECT ci_namespace_mirrors.namespace_id
      FROM ci_namespace_mirrors
      WHERE (traversal_ids && ARRAY[...ids...]::int[])
    )
    SELECT ci_runners.*
    FROM ci_runners
    INNER JOIN ci_runner_namespaces ON ci_runner_namespaces.runner_id = ci_runners.id
    WHERE (ci_runner_namespaces.namespace_id IN (SELECT namespace_id FROM cte_namespace_ids))
  )
) ci_runners;

Production result: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/9479/commands/33628

Other attempts:

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #336436

Edited by Furkan Ayhan

Merge request reports