Optimize User#ci_owned_runners query V3
What does this MR do and why?
This is a new version of !79729 (merged). It was reverted by !80225 (merged) because of a staging QA incident: gitlab-com/gl-infra/production#6315 (closed). The problem is that the user gitlab-qa
/ 1614863
has a very extraordinary number of groups and projects (~6-7k) while the staging environment does not have many other than those. So, the SQL query plan was not good.
We also recently tried !81248 (closed) but it was not good in production.
Then, we decided to clean up staging env, so we did: 1, 2, 3.
Then, @ayufan created !83521 (merged) to reduce the number of ids that we pass to the query.
This MR is based on that and it optimizes the query from User#ci_owned_runners
by moving subqueries into CTEs and letting them use the existing index with a smarter query plan.
- Original issue: #336436
- Feature flag: #350322 (closed)
ci_owned_runners_cross_joins_fix
Database
Required data:
SELECT "members"."source_id"
FROM "members"
WHERE "members"."source_type" = 'Project'
AND "members"."type" = 'ProjectMember'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 40);
SELECT "namespaces"."traversal_ids"
FROM "namespaces"
INNER JOIN "members" ON "members"."source_type" = 'Namespace'
AND "members"."requested_at" IS NULL
AND "members"."source_id" = "namespaces"."id"
AND "members"."type" = 'GroupMember'
WHERE "namespaces"."type" = 'Group'
AND "members"."source_type" = 'Namespace'
AND "members"."type" = 'GroupMember'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10)
AND (access_level >= 40)
ORDER BY namespaces.traversal_ids;
-- ids = [result above]
-- last_prefix = [-1]
-- namespace_ids = []
-- ids.each do |prefix|
-- next if last_prefix == prefix[0..(last_prefix.count - 1)]
-- last_prefix = prefix
-- namespace_ids << prefix[-1]
-- end
SELECT "namespaces"."traversal_ids"
FROM "namespaces"
INNER JOIN "members" ON "members"."source_type" = 'Namespace'
AND "members"."requested_at" IS NULL
AND "members"."source_id" = "namespaces"."id"
AND "members"."type" = 'GroupMember'
WHERE "namespaces"."type" = 'Group'
AND "members"."source_type" = 'Namespace'
AND "members"."type" = 'GroupMember'
AND "members"."user_id" = 1614863
AND "members"."requested_at" IS NULL
AND (access_level >= 10)
AND (access_level >= 50)
ORDER BY namespaces.traversal_ids;
-- ids = [result above]
-- last_prefix = [-1]
-- namespace_ids = []
-- ids.each do |prefix|
-- next if last_prefix == prefix[0..(last_prefix.count - 1)]
-- last_prefix = prefix
-- namespace_ids << prefix[-1]
-- end
New query for ci_owned_runners
SELECT ci_runners.*
FROM (
(
SELECT ci_runners.*
FROM ci_runners
INNER JOIN ci_runner_projects runner_projects ON runner_projects.runner_id = ci_runners.id
WHERE runner_projects.project_id IN (...ids...)
)
UNION
(
WITH cte_namespace_ids AS MATERIALIZED (
SELECT ci_namespace_mirrors.namespace_id
FROM ci_namespace_mirrors
WHERE (traversal_ids && ARRAY[...ids...]::int[])
),
cte_project_ids AS MATERIALIZED (
SELECT ci_project_mirrors.project_id
FROM ci_project_mirrors
WHERE (ci_project_mirrors.namespace_id IN (SELECT namespace_id FROM cte_namespace_ids))
)
SELECT ci_runners.*
FROM ci_runners
INNER JOIN ci_runner_projects ON ci_runner_projects.runner_id = ci_runners.id
WHERE (ci_runner_projects.project_id IN (SELECT project_id FROM cte_project_ids))
)
UNION
(
WITH cte_namespace_ids AS MATERIALIZED (
SELECT ci_namespace_mirrors.namespace_id
FROM ci_namespace_mirrors
WHERE (traversal_ids && ARRAY[...ids...]::int[])
)
SELECT ci_runners.*
FROM ci_runners
INNER JOIN ci_runner_namespaces ON ci_runner_namespaces.runner_id = ci_runners.id
WHERE (ci_runner_namespaces.namespace_id IN (SELECT namespace_id FROM cte_namespace_ids))
)
) ci_runners;
Production result: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/9479/commands/33628
Other attempts:
- no CTE: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/9479/commands/33625
- full CTE: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/9479/commands/33624
- Minimum CTE with JOIN: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/9479/commands/33622
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #336436