Similarity sorting in group projects API
What does this MR do?
Related issue: #221043 (closed)
This MR introduces similarity search based on the incoming search parameter for the group projects API endpoint. The goal is to show better matches on top when using the projects dropdown.
Note: The change is behind a feature flag. I plan to use this feature in a follow up issue for the project dropdown at https://gitlab.com/groups/gitlab-org/-/analytics/value_stream_analytics
Query
Current: orders by last_activity_at DESC
API endpoint: api/v4/groups/9970/projects.json?search=git&per_page=50&with_shared=false&order_by=last_activity_at&include_subgroups=true
SELECT "projects".*
FROM "projects"
WHERE "projects"."namespace_id" IN (WITH recursive "base_and_descendants" AS (
(
SELECT "namespaces".*
FROM "namespaces"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = 9970)
UNION
(
SELECT "namespaces".*
FROM "namespaces",
"base_and_descendants"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id"))SELECT "id"
FROM "base_and_descendants" AS "namespaces")
AND (
EXISTS
(
SELECT 1
FROM "project_authorizations"
WHERE "project_authorizations"."user_id" = 4156052
AND (
project_authorizations.project_id = projects.id))
OR projects.visibility_level IN (0,10,20))
AND ((
"projects"."path" ilike '%git%'
OR "projects"."name" ilike '%git%')
OR "projects"."description" ilike '%git%')
AND "projects"."pending_delete" = false
ORDER BY "projects"."last_activity_at" DESC,
"projects"."id" DESC limit 50 offset 0
Top 10 results:
name | path
-----------------------+-----------------------
gitter-android-app | gitter-android-app
GitLab | gitlab
gitaly | gitaly
gitlab-runner | gitlab-runner
Pajamas Design System | design.gitlab.com
GitLab UX Research | ux-research
triage-ops | triage-ops
customers-gitlab-com | customers-gitlab-com
omnibus-gitlab-mirror | omnibus-gitlab-mirror
omnibus-gitlab | omnibus-gitlab
New: Order by similarity DESC
The only difference is the ORDER BY
expression.
API endpoint: api/v4/groups/9970/projects.json?search=git&per_page=50&with_shared=false&order_by=similarity&include_subgroups=true
SELECT "projects".*
FROM "projects"
WHERE "projects"."namespace_id" IN (WITH recursive "base_and_descendants" AS (
(
SELECT "namespaces".*
FROM "namespaces"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = 9970)
UNION
(
SELECT "namespaces".*
FROM "namespaces",
"base_and_descendants"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id"))SELECT "id"
FROM "base_and_descendants" AS "namespaces")
AND (
EXISTS
(
SELECT 1
FROM "project_authorizations"
WHERE "project_authorizations"."user_id" = 4156052
AND (
project_authorizations.project_id = projects.id))
OR projects.visibility_level IN (0,10,20))
AND ((
"projects"."path" ilike '%git%'
OR "projects"."name" ilike '%git%')
OR "projects"."description" ilike '%git%')
AND "projects"."pending_delete" = false
ORDER BY (similarity(COALESCE("projects"."path", ''), 'git') * cast('1' AS numeric)) + (similarity(COALESCE("projects"."name", ''), 'git') * cast('0.7' AS numeric)) + (similarity(COALESCE("projects"."description", ''), 'git') * cast('0.2' AS numeric)) DESC,
"projects"."id" DESC limit 50 offset 0
Plan - A few ms overhead because of the similarity calculation.
Top 10 results:
name | path
-----------------+-----------------
Git | git
GitLab-Git | gitlab-git
gitlab_git | gitlab_git
----- | -----
----- | -----
gitaly | gitaly
GitLab | gitlab
gitlab-grit | gitlab-grit
gitlab-git-test | gitlab-git-test
GitLab Chart | gitlab
Removed 2 private projects from the list.
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team