Prioritize exact matches on project search
What does this MR do?
This MR prioritizes exact matches for #235676 (closed).
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Database Query
With similarity sort
Time: 681.251 ms
EXPLAIN SELECT “projects”.* FROM “projects” INNER JOIN “project_authorizations” ON “projects”.“id” = “project_authorizations”.“project_id” INNER JOIN “routes” ON “routes”.“source_type” = ‘Project’ AND “routes”.“source_id” = “projects”.“id” WHERE “project_authorizations”.“user_id” = 4473655 AND ((“routes”.“path” ILIKE ‘%gitlab-org/gitlab%’ OR “routes”.“name” ILIKE ‘%gitlab-org/gitlab%‘) OR “projects”.“description” ILIKE ‘%gitlab-org/gitlab%‘) AND “projects”.“pending_delete” = false ORDER BY (/* gitlab/database/similarity_score */ SIMILARITY(COALESCE(“projects”.“path”, ‘’), ‘gitlab-org/gitlab’) * CAST(‘1’ AS numeric)) + (/* gitlab/database/similarity_score */ SIMILARITY(COALESCE(“projects”.“name”, ‘’), ‘gitlab-org/gitlab’) * CAST(‘0.7’ AS numeric)) + (/* gitlab/database/similarity_score */ SIMILARITY(COALESCE(“projects”.“description”, ‘’), ‘gitlab-org/gitlab’) * CAST(‘0.2’ AS numeric)) DESC, “projects”.“id” DESC LIMIT 20
Plan with execution:
Limit (cost=10803.06..10803.06 rows=1 width=733) (actual time=92.755..92.760 rows=20 loops=1)
Buffers: shared hit=33603
-> Sort (cost=10803.06..10803.06 rows=1 width=733) (actual time=92.753..92.755 rows=20 loops=1)
Sort Key: ((((similarity((COALESCE(projects.path, ''::character varying))::text, 'gitlab-org/gitlab'::text) * '1'::double precision) + (similarity((COALESCE(projects.name, ''::character varying))::text, 'gitlab-org/gitlab'::text) * '0.7'::double precision)) + (similarity(COALESCE(projects.description, ''::text), 'gitlab-org/gitlab'::text) * '0.2'::double precision))) DESC, projects.id DESC
Sort Method: top-N heapsort Memory: 58kB
Buffers: shared hit=33603
-> Nested Loop (cost=1.57..10803.05 rows=1 width=733) (actual time=0.887..92.185 rows=121 loops=1)
Buffers: shared hit=33603
-> Nested Loop (cost=1.01..9184.70 rows=2519 width=729) (actual time=0.046..28.113 rows=3538 loops=1)
Buffers: shared hit=15885
-> Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on public.project_authorizations (cost=0.57..595.85 rows=2519 width=4) (actual time=0.027..2.790 rows=3540 loops=1)
Index Cond: (project_authorizations.user_id = 4473655)
Heap Fetches: 831
Buffers: shared hit=1722
-> Index Scan using projects_pkey on public.projects (cost=0.43..3.41 rows=1 width=725) (actual time=0.006..0.006 rows=1 loops=3540)
Index Cond: (projects.id = project_authorizations.project_id)
Filter: (NOT projects.pending_delete)
Rows Removed by Filter: 0
Buffers: shared hit=14163
-> Index Scan using index_routes_on_source_type_and_source_id on public.routes (cost=0.56..0.63 rows=1 width=52) (actual time=0.016..0.016 rows=0 loops=3538)
Index Cond: (((routes.source_type)::text = 'Project'::text) AND (routes.source_id = projects.id))
Filter: (((routes.path)::text ~~* '%gitlab-org/gitlab%'::text) OR ((routes.name)::text ~~* '%gitlab-org/gitlab%'::text) OR (projects.description ~~* '%gitlab-org/gitlab%'::text))
Rows Removed by Filter: 1
Buffers: shared hit=17718
Summary:
Time: 681.251 ms
- planning: 14.537 ms
- execution: 666.714 ms
- I/O read: 525.132 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 19930 (~155.70 MiB) from the buffer pool
- reads: 13679 (~106.90 MiB) from the OS file cache, including disk I/O
- dirtied: 462 (~3.60 MiB)
- writes: 0
Without similarity sort
Time: 269.743 ms
EXPLAIN SELECT “projects”.* FROM “projects” INNER JOIN “project_authorizations” ON “projects”.“id” = “project_authorizations”.“project_id” INNER JOIN “routes” ON “routes”.“source_type” = ‘Project’ AND “routes”.“source_id” = “projects”.“id” WHERE “project_authorizations”.“user_id” = 4473655 AND ((“routes”.“path” ILIKE ‘%gitlab-org/gitlab%’ OR “routes”.“name” ILIKE ‘%gitlab-org/gitlab%‘) OR “projects”.“description” ILIKE ‘%gitlab-org/gitlab%’) AND “projects”.“pending_delete” = false ORDER BY “projects”.“id” DESC LIMIT 20
Plan with execution
Limit (cost=1.57..10803.03 rows=1 width=725) (actual time=11.814..252.311 rows=20 loops=1)
Buffers: shared hit=4285 read=3304 dirtied=135
I/O Timings: read=211.069
-> Nested Loop (cost=1.57..10803.03 rows=1 width=725) (actual time=11.813..252.292 rows=20 loops=1)
Buffers: shared hit=4285 read=3304 dirtied=135
I/O Timings: read=211.069
-> Nested Loop (cost=1.01..9184.70 rows=2519 width=729) (actual time=0.791..161.352 rows=757 loops=1)
Buffers: shared hit=1890 read=1904 dirtied=71
I/O Timings: read=144.766
-> Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on public.project_authorizations (cost=0.57..595.85 rows=2519 width=4) (actual time=0.528..65.284 rows=757 loops=1)
Index Cond: (project_authorizations.user_id = 4473655)
Heap Fetches: 471
Buffers: shared hit=247 read=519 dirtied=24
I/O Timings: read=61.951
-> Index Scan using projects_pkey on public.projects (cost=0.43..3.41 rows=1 width=725) (actual time=0.124..0.124 rows=1 loops=757)
Index Cond: (projects.id = project_authorizations.project_id)
Filter: (NOT projects.pending_delete)
Rows Removed by Filter: 0
Buffers: shared hit=1643 read=1385 dirtied=47
I/O Timings: read=82.815
-> Index Scan using index_routes_on_source_type_and_source_id on public.routes (cost=0.56..0.63 rows=1 width=52) (actual time=0.118..0.118 rows=0 loops=757)
Index Cond: (((routes.source_type)::text = 'Project'::text) AND (routes.source_id = projects.id))
Filter: (((routes.path)::text ~~* '%gitlab-org/gitlab%'::text) OR ((routes.name)::text ~~* '%gitlab-org/gitlab%'::text) OR (projects.description ~~* '%gitlab-org/gitlab%'::text))
Rows Removed by Filter: 1
Buffers: shared hit=2395 read=1400 dirtied=64
I/O Timings: read=66.303
Summary
Time: 269.743 ms
- planning: 17.257 ms
- execution: 252.486 ms
- I/O read: 211.069 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 4285 (~33.50 MiB) from the buffer pool
- reads: 3304 (~25.80 MiB) from the OS file cache, including disk I/O
- dirtied: 135 (~1.10 MiB)
- writes: 0
Edited by Jonathan Schafer