Skip to content

Simplify database queries in MembersFinder and improve performance for Projects::ProjectMembersController#index

Andreas Brandl requested to merge 41461-project-members-slow-due-to-sql into master

What does this MR do?

This change simplifies database queries in MembersFinder. In addition, it fixes a N+1 query problem for the member -> user queries.

The original query looked like this and took 10258ms on GitLab.com.

SELECT "members".*
FROM "members"
LEFT JOIN "users" ON "members"."user_id" = "users"."id"
WHERE (
		members.id IN (
			SELECT "members"."id"
			FROM "members"
			WHERE "members"."source_type" = 'Project'
				AND "members"."type" IN ('ProjectMember')
				AND "members"."source_id" = 13083
				AND "members"."source_type" = 'Project'
				AND "members"."type" IN ('ProjectMember')
				AND "members"."requested_at" IS NULL
				AND "members"."invite_token" IS NULL
			)
		OR members.id IN (
			SELECT "members"."id"
			FROM "members"
			WHERE "members"."source_type" = 'Namespace'
				AND "members"."type" IN ('GroupMember')
				AND "members"."source_id" = 9970
				AND "members"."source_type" = 'Namespace'
				AND "members"."type" IN ('GroupMember')
				AND "members"."requested_at" IS NULL
				AND (
					"members"."user_id" NOT IN (
						SELECT "members"."user_id"
						FROM "members"
						WHERE "members"."source_type" = 'Project'
							AND "members"."type" IN ('ProjectMember')
							AND "members"."source_id" = 13083
							AND "members"."source_type" = 'Project'
							AND "members"."type" IN ('ProjectMember')
							AND "members"."requested_at" IS NULL
							AND "members"."invite_token" IS NULL
							AND ("members"."user_id" IS NOT NULL)
						)
					)
				AND "members"."invite_token" IS NULL
			)
		)
ORDER BY users.name ASC NULLS LAST LIMIT 20 OFFSET 0

The new query (for PostgreSQL) leverages DISTINCT ON and executes in 11ms:

SELECT "members".*
FROM (
	SELECT DISTINCT ON (user_id) member_union.*
	FROM (
		SELECT "members".*
		FROM "members"
		WHERE "members"."source_type" = 'Project'
			AND "members"."type" IN ('ProjectMember')
			AND "members"."source_id" = 13083
			AND "members"."source_type" = 'Project'
			AND "members"."type" IN ('ProjectMember')
			AND "members"."requested_at" IS NULL
			AND "members"."invite_token" IS NULL
		
		UNION ALL
		
		SELECT "members".*
		FROM "members"
		WHERE "members"."source_type" = 'Namespace'
			AND "members"."type" IN ('GroupMember')
			AND "members"."source_id" = 9970
			AND "members"."source_type" = 'Namespace'
			AND "members"."type" IN ('GroupMember')
			AND "members"."requested_at" IS NULL
			AND "members"."invite_token" IS NULL
		) AS member_union
	ORDER BY user_id
		,CASE 
			WHEN type = 'ProjectMember'
				THEN 1
			WHEN type = 'GroupMember'
				THEN 2
			ELSE 3
			END
	) AS members
LEFT JOIN "users" ON "members"."user_id" = "users"."id"
ORDER BY users.name ASC NULLS LAST LIMIT 20 OFFSET 0;

Are there points in the code the reviewer needs to double check?

I feel like the translation of DISTINCT ON to MySQL is really poor. With window functions, this would be fairly straight forward to do in MySQL (ROW_NUMBER() OVER (...) = 1), but AFAIK we still need to support an old version of MySQL which doesn't have window functions (right?).

Is there a better alternative to translate this to MySQL?

Why was this MR needed?

Commonly-used controller with

  • mean: 4s
  • p95: 11s
  • p99: 13s
  • calls per 24h: 8,000

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

Closes #41461 (closed)

Edited by Yorick Peterse

Merge request reports