Skip to content

Negated filtering with non-existing label produces very slow query plans

When issues are filtered with a non-existing label, a very slow query plan is produced.

It's best to start with an example to illustrate the problem.

Suppose we want to filter for issues that don't have idontexist label and this idontexist label does NOT exist within gitlab-org group.

A corresponding issues list url with filters would be:

https://gitlab.com/groups/gitlab-org/-/issues?label_name%5B%5D=section%3A%3Adev&label_name%5B%5D=security&not%5Blabel_name%5D%5B%5D=idontexist

Visiting the above url may produce a query like:

SELECT issues.*
FROM issues
	INNER JOIN projects ON projects.id = issues.project_id
	LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE (
		NOT EXISTS (
			SELECT 1
			FROM banned_users
			WHERE (issues.author_id = banned_users.user_id)
		)
	)
	AND (
		issues.confidential IS NOT TRUE
		OR (
			issues.confidential = TRUE
			AND (
				issues.author_id = 5961404
				OR EXISTS (
					SELECT TRUE
					FROM issue_assignees
					WHERE user_id = 5961404
						AND issue_id = issues.id
				)
				OR EXISTS (
					SELECT 1
					FROM project_authorizations
					WHERE project_authorizations.user_id = 5961404
						AND (
							project_authorizations.project_id = issues.project_id
						)
						AND (project_authorizations.access_level >= 20)
				)
			)
		)
	)
	AND projects.namespace_id IN (
		SELECT namespaces.id
		FROM namespaces
		WHERE (traversal_ids @> ({9970}))
	)
	AND (
		EXISTS (
			SELECT 1
			FROM project_authorizations
			WHERE project_authorizations.user_id = 5961404
				AND (project_authorizations.project_id = projects.id)
				AND (project_authorizations.access_level >= 10)
		)
		OR projects.visibility_level IN (10, 20)
	)
	AND (
		project_features.issues_access_level IS NULL
		OR project_features.issues_access_level IN (20, 30)
		OR (
			project_features.issues_access_level = 10
			AND EXISTS (
				SELECT 1
				FROM project_authorizations
				WHERE project_authorizations.user_id = 5961404
					AND (project_authorizations.project_id = projects.id)
					AND (project_authorizations.access_level >= 10)
			)
		)
	)
	AND (issues.state_id IN (1))
	AND projects.archived = FALSE
	AND EXISTS (
		SELECT label_links.*
		FROM label_links
		WHERE label_links.target_type = Issue
			AND label_links.target_id = issues.id
			AND label_links.label_id = 14918378
	)
	AND EXISTS (
		SELECT label_links.*
		FROM label_links
		WHERE label_links.target_type = Issue
			AND label_links.target_id = issues.id
			AND label_links.label_id IN (2779335, 3482026)
	)
	AND NOT (
		EXISTS (
			SELECT label_links.*
			FROM label_links
			WHERE label_links.target_type = Issue
				AND label_links.target_id = issues.id
				AND 1 = 0
		)
	)
	AND issues.issue_type IN (0, 1)
	AND projects.archived = FALSE
ORDER BY issues.created_at DESC,
	issues.id DESC
LIMIT 20 OFFSET 0;

Note one of the conditional clauses in the query:

	AND NOT (
		EXISTS (
			SELECT label_links.*
			FROM label_links
			WHERE label_links.target_type = Issue
				AND label_links.target_id = issues.id
				AND 1 = 0
		)
	)

This is equivalent to AND NOT (FALSE) but using the former produces a very slow query plan (the subquery for retrieving project ids gets materialized and nested under a loop):

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7091/commands/25076#visualize-depesz

If we use AND NOT (FALSE), the query plan is much faster:

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7091/commands/25077

Workarounds

Do not use deleted/non-existing labels as filters.

Edited by euko