Overly complex SQL doing large hash join spills to disk can probably be optimized to use indexes
This SQL seems overly complex and in particular it's doing a large hash join that spills to disk. At a quick glance that large hash join seems to be a small part of the query and can probably be optimized using indexes to not be accessing such a large set of records.
SELECT "members".*
FROM "members"
LEFT JOIN "users" ON "members"."user_id" = "users"."id"
WHERE "members"."type" IN ('GroupMember')
AND "members"."source_type" = 'Namespace'
AND (
members.id IN (
SELECT "members"."id"
FROM "members"
WHERE "members"."source_type" = 'Namespace'
AND "members"."type" IN ('GroupMember')
AND "members"."source_id" = 2763806
AND "members"."source_type" = 'Namespace'
AND "members"."type" IN ('GroupMember')
AND "members"."requested_at" IS NULL
)
OR members.id IN (
SELECT "members"."id"
FROM "members"
WHERE "members"."type" IN ('GroupMember')
AND "members"."source_type" = 'Namespace'
AND "members"."requested_at" IS NULL
AND "members"."source_id" IN (
WITH RECURSIVE "base_and_ancestors" AS (
SELECT "namespaces".*
FROM "namespaces"
WHERE "namespaces"."type" IN ('Group')
AND "namespaces"."id" = 1779761
UNION
SELECT "namespaces".*
FROM "namespaces"
,"base_and_ancestors"
WHERE "namespaces"."type" IN ('Group')
AND "namespaces"."id" = "base_and_ancestors"."parent_id"
)
SELECT "id"
FROM "base_and_ancestors" AS "namespaces"
)
AND (
"members"."user_id" NOT IN (
SELECT "users"."id"
FROM "users"
INNER JOIN "members" ON "users"."id" = "members"."user_id"
WHERE "members"."source_type" = 'Namespace'
AND "members"."source_id" = 2763806
AND "members"."source_type" = 'Namespace'
AND "members"."type" IN ('GroupMember')
AND "members"."requested_at" IS NULL
)
)
)
)
ORDER BY users.name ASC NULLS LAST LIMIT 50 OFFSET 0;
Limit (cost=440775.47..440775.60 rows=50 width=180) (actual time=4168.805..4168.812 rows=7 loops=1)
Output: members.id, members.access_level, members.source_id, members.source_type, members.user_id, members.notification_level, members.type, members.created_at, members.updated_at, members.created_by_id, members.invite_email, members.invite_token, members.invite_accepted_at, members.requested_at, members.expires_at, members.ldap, members.override, users.name
Buffers: shared hit=1166717 dirtied=6, temp read=2437 written=9087
-> Sort (cost=440775.47..441228.71 rows=181296 width=180) (actual time=4168.802..4168.805 rows=7 loops=1)
Output: members.id, members.access_level, members.source_id, members.source_type, members.user_id, members.notification_level, members.type, members.created_at, members.updated_at, members.created_by_id, members.invite_email, members.invite_token, members.invite_accepted_at, members.requested_at, members.expires_at, members.ldap, members.override, users.name
Sort Key: users.name
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1166717 dirtied=6, temp read=2437 written=9087
-> Hash Left Join (cost=223682.27..434752.95 rows=181296 width=180) (actual time=3304.853..4168.672 rows=7 loops=1)
Output: members.id, members.access_level, members.source_id, members.source_type, members.user_id, members.notification_level, members.type, members.created_at, members.updated_at, members.created_by_id, members.invite_email, members.invite_token, members.invite_accepted_at, members.requested_at, members.expires_at, members.ldap, members.override, users.name
Hash Cond: (members.user_id = users.id)
Buffers: shared hit=1166714 dirtied=6, temp read=2437 written=9087
-> Index Scan using index_members_on_source_id_and_source_type on public.members (cost=618.61..189639.43 rows=181296 width=168) (actual time=1043.669..1836.099 rows=7 loops=1)
Output: members.id, members.access_level, members.source_id, members.source_type, members.user_id, members.notification_level, members.type, members.created_at, members.updated_at, members.created_by_id, members.invite_email, members.invite_token, members.invite_accepted_at, members.requested_at, members.expires_at, members.ldap, members.override
Index Cond: ((members.source_type)::text = 'Namespace'::text)
Filter: (((members.type)::text = 'GroupMember'::text) AND ((hashed SubPlan 1) OR (hashed SubPlan 4)))
Rows Removed by Filter: 1155974
Buffers: shared hit=1004883 dirtied=6
SubPlan 1
-> Index Scan using index_members_on_source_id_and_source_type on public.members members_1 (cost=0.43..4.47 rows=1 width=4) (actual time=0.038..0.046 rows=2 loops=1)
Output: members_1.id
Index Cond: ((members_1.source_id = 2763806) AND ((members_1.source_type)::text = 'Namespace'::text))
Filter: ((members_1.requested_at IS NULL) AND ((members_1.type)::text = 'GroupMember'::text))
Buffers: shared hit=5
SubPlan 4
-> Nested Loop (cost=360.92..613.68 rows=5 width=4) (actual time=0.204..0.252 rows=5 loops=1)
Output: members_3.id
Buffers: shared hit=26
-> HashAggregate (cost=352.55..352.76 rows=21 width=4) (actual time=0.052..0.052 rows=1 loops=1)
Output: namespaces_2.id
Group Key: namespaces_2.id
Buffers: shared hit=4
-> CTE Scan on base_and_ancestors namespaces_2 (cost=351.87..352.29 rows=21 width=4) (actual time=0.045..0.050 rows=1 loops=1)
Output: namespaces_2.id
Buffers: shared hit=4
CTE base_and_ancestors
-> Recursive Union (cost=0.43..351.87 rows=21 width=183) (actual time=0.043..0.047 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using namespaces_pkey on public.namespaces (cost=0.43..3.45 rows=1 width=183) (actual time=0.037..0.038 rows=1 loops=1)
Output: namespaces.id, namespaces.name, namespaces.path, namespaces.owner_id, namespaces.created_at, namespaces.updated_at, namespaces.type, namespaces.description, namespaces.avatar, namespaces.membership_lock, namespaces.share_with_group_lock, namespaces.visibility_level, namespaces.request_access_enabled, namespaces.ldap_sync_status, namespaces.ldap_sync_error, namespaces.ldap_sync_last_update_at, namespaces.ldap_sync_last_successful_update_at, namespaces.ldap_sync_last_sync_at, namespaces.lfs_enabled, namespaces.description_html, namespaces.parent_id, namespaces.shared_runners_minutes_limit, namespaces.repository_size_limit, namespaces.require_two_factor_authentication, namespaces.two_factor_grace_period, namespaces.cached_markdown_version, namespaces.plan_id, namespaces.project_creation_level
Index Cond: (namespaces.id = 1779761)
Filter: ((namespaces.type)::text = 'Group'::text)
Buffers: shared hit=4
-> Nested Loop (cost=0.43..34.80 rows=2 width=183) (actual time=0.003..0.003 rows=0 loops=1)
Output: namespaces_1.id, namespaces_1.name, namespaces_1.path, namespaces_1.owner_id, namespaces_1.created_at, namespaces_1.updated_at, namespaces_1.type, namespaces_1.description, namespaces_1.avatar, namespaces_1.membership_lock, namespaces_1.share_with_group_lock, namespaces_1.visibility_level, namespaces_1.request_access_enabled, namespaces_1.ldap_sync_status, namespaces_1.ldap_sync_error, namespaces_1.ldap_sync_last_update_at, namespaces_1.ldap_sync_last_successful_update_at, namespaces_1.ldap_sync_last_sync_at, namespaces_1.lfs_enabled, namespaces_1.description_html, namespaces_1.parent_id, namespaces_1.shared_runners_minutes_limit, namespaces_1.repository_size_limit, namespaces_1.require_two_factor_authentication, namespaces_1.two_factor_grace_period, namespaces_1.cached_markdown_version, namespaces_1.plan_id, namespaces_1.project_creation_level
-> WorkTable Scan on base_and_ancestors (cost=0.00..0.20 rows=10 width=4) (actual time=0.001..0.001 rows=1 loops=1)
Output: base_and_ancestors.id, base_and_ancestors.name, base_and_ancestors.path, base_and_ancestors.owner_id, base_and_ancestors.created_at, base_and_ancestors.updated_at, base_and_ancestors.type, base_and_ancestors.description, base_and_ancestors.avatar, base_and_ancestors.membership_lock, base_and_ancestors.share_with_group_lock, base_and_ancestors.visibility_level, base_and_ancestors.request_access_enabled, base_and_ancestors.ldap_sync_status, base_and_ancestors.ldap_sync_error, base_and_ancestors.ldap_sync_last_update_at, base_and_ancestors.ldap_sync_last_successful_update_at, base_and_ancestors.ldap_sync_last_sync_at, base_and_ancestors.lfs_enabled, base_and_ancestors.description_html, base_and_ancestors.parent_id, base_and_ancestors.shared_runners_minutes_limit, base_and_ancestors.repository_size_limit, base_and_ancestors.require_two_factor_authentication, base_and_ancestors.two_factor_grace_period, base_and_ancestors.cached_markdown_version, base_and_ancestors.plan_id, base_and_ancestors.project_creation_level
-> Index Scan using namespaces_pkey on public.namespaces namespaces_1 (cost=0.43..3.45 rows=1 width=183) (actual time=0.001..0.001 rows=0 loops=1)
Output: namespaces_1.id, namespaces_1.name, namespaces_1.path, namespaces_1.owner_id, namespaces_1.created_at, namespaces_1.updated_at, namespaces_1.type, namespaces_1.description, namespaces_1.avatar, namespaces_1.membership_lock, namespaces_1.share_with_group_lock, namespaces_1.visibility_level, namespaces_1.request_access_enabled, namespaces_1.ldap_sync_status, namespaces_1.ldap_sync_error, namespaces_1.ldap_sync_last_update_at, namespaces_1.ldap_sync_last_successful_update_at, namespaces_1.ldap_sync_last_sync_at, namespaces_1.lfs_enabled, namespaces_1.description_html, namespaces_1.parent_id, namespaces_1.shared_runners_minutes_limit, namespaces_1.repository_size_limit, namespaces_1.require_two_factor_authentication, namespaces_1.two_factor_grace_period, namespaces_1.cached_markdown_version, namespaces_1.plan_id, namespaces_1.project_creation_level
Index Cond: (namespaces_1.id = base_and_ancestors.parent_id)
Filter: ((namespaces_1.type)::text = 'Group'::text)
-> Index Scan using index_members_on_source_id_and_source_type on public.members members_3 (cost=8.37..12.41 rows=1 width=8) (actual time=0.150..0.194 rows=5 loops=1)
Output: members_3.id, members_3.access_level, members_3.source_id, members_3.source_type, members_3.user_id, members_3.notification_level, members_3.type, members_3.created_at, members_3.updated_at, members_3.created_by_id, members_3.invite_email, members_3.invite_token, members_3.invite_accepted_at, members_3.requested_at, members_3.expires_at, members_3.ldap, members_3.override
Index Cond: ((members_3.source_id = namespaces_2.id) AND ((members_3.source_type)::text = 'Namespace'::text))
Filter: ((members_3.requested_at IS NULL) AND (NOT (hashed SubPlan 2)) AND ((members_3.type)::text = 'GroupMember'::text))
Rows Removed by Filter: 1
Buffers: shared hit=22
SubPlan 2
-> Nested Loop (cost=0.86..7.93 rows=1 width=4) (actual time=0.050..0.082 rows=2 loops=1)
Output: users_1.id
Buffers: shared hit=13
-> Index Scan using index_members_on_source_id_and_source_type on public.members members_2 (cost=0.43..4.47 rows=1 width=4) (actual time=0.003..0.005 rows=2 loops=1)
Output: members_2.id, members_2.access_level, members_2.source_id, members_2.source_type, members_2.user_id, members_2.notification_level, members_2.type, members_2.created_at, members_2.updated_at, members_2.created_by_id, members_2.invite_email, members_2.invite_token, members_2.invite_accepted_at, members_2.requested_at, members_2.expires_at, members_2.ldap, members_2.override
Index Cond: ((members_2.source_id = 2763806) AND ((members_2.source_type)::text = 'Namespace'::text))
Filter: ((members_2.requested_at IS NULL) AND ((members_2.type)::text = 'GroupMember'::text))
Buffers: shared hit=5
-> Index Only Scan using users_pkey on public.users users_1 (cost=0.43..3.45 rows=1 width=4) (actual time=0.036..0.037 rows=1 loops=2)
Output: users_1.id
Index Cond: (users_1.id = members_2.user_id)
Heap Fetches: 0
Buffers: shared hit=8
-> Hash (cost=184192.63..184192.63 rows=2236163 width=16) (actual time=2258.672..2258.672 rows=2075304 loops=1)
Output: users.name, users.id
Buckets: 524288 Batches: 16 Memory Usage: 10448kB
Buffers: shared hit=161831, temp written=9068
-> Seq Scan on public.users (cost=0.00..184192.63 rows=2236163 width=16) (actual time=0.023..1380.273 rows=2075304 loops=1)
Output: users.name, users.id
Buffers: shared hit=161831
Planning time: 3.202 ms
Execution time: 4169.192 ms
(80 rows)
Edited by 🤖 GitLab Bot 🤖