Skip to content

Geo - Support filtering job artifact registries by keyword via GraphQL

What does this MR do and why?

Currently, for self-service replicables, we are unable to do any filtering on the Geo Replicable views.

This MR provides the base of this functionality and supports filtering job artifact registries by keyword (fuzzy_search) via GraphQL:

Registry Searchable attributes
Job Artifacts project :name, :description and :path

Database

A new query introduced in Geo::JobArtifactRegistry using with_search inherited method from Geo::BaseRegistry:

where(self::MODEL_FOREIGN_KEY => self::MODEL_CLASS.search(query).limit(1000).pluck_primary_key)

For Geo::JobArtifactRegistry it's equivalent to:

where(artifact_id: ::Ci::JobArtifact.search(query).limit(1000).pluck_primary_key)

and search method for ::Ci::JobArtifact:

where(project_id: ::Project.search(query).limit(1000).pluck_primary_key)

and search method for ::Project:

fuzzy_search(query, [:path, :name, :description])

Raw SQL and query plans

Using:

Geo::JobArtifactRegistry.with_search('GitLab')

Then:

SELECT
    "projects"."id"
FROM
    "projects"
WHERE (("projects"."path" ILIKE '%GitLab%'
        OR "projects"."name" ILIKE '%GitLab%')
    OR "projects"."description" ILIKE '%GitLab%')
LIMIT 1000

Query plan

 Limit  (cost=0.00..2019.29 rows=1000 width=4) (actual time=12.508..164.210 rows=1000 loops=1)
   Buffers: shared read=1182 dirtied=76 written=70
   I/O Timings: read=90.169 write=1.522
   ->  Seq Scan on public.projects  (cost=0.00..5163188.34 rows=2556938 width=4) (actual time=12.506..164.017 rows=1000 loops=1)
         Filter: (((projects.path)::text ~~* '%GitLab%'::text) OR ((projects.name)::text ~~* '%GitLab%'::text) OR (projects.description ~~* '%GitLab%'::text))
         Rows Removed by Filter: 29396
         Buffers: shared read=1182 dirtied=76 written=70
         I/O Timings: read=90.169 write=1.522

Then:

SELECT
    "ci_job_artifacts"."project_id",
    "ci_job_artifacts"."file_type",
    "ci_job_artifacts"."size",
    "ci_job_artifacts"."created_at",
    "ci_job_artifacts"."updated_at",
    "ci_job_artifacts"."expire_at",
    "ci_job_artifacts"."file",
    "ci_job_artifacts"."file_store",
    "ci_job_artifacts"."file_sha256",
    "ci_job_artifacts"."file_format",
    "ci_job_artifacts"."file_location",
    "ci_job_artifacts"."id",
    "ci_job_artifacts"."job_id",
    "ci_job_artifacts"."locked",
    "ci_job_artifacts"."partition_id"
FROM
    "ci_job_artifacts"
WHERE
    "ci_job_artifacts"."project_id" = 4

Query plan

 Index Scan using index_ci_job_artifacts_on_id_project_id_and_created_at on public.ci_job_artifacts  (cost=0.15..3.17 rows=1 width=147) (actual time=0.003..0.003 rows=0 loops=1)
   Index Cond: (ci_job_artifacts.project_id = 4)
   Buffers: shared hit=2
   I/O Timings: read=0.000 write=0.000
SELECT
    "job_artifact_registry"."id",
    "job_artifact_registry"."created_at",
    "job_artifact_registry"."retry_at",
    "job_artifact_registry"."bytes",
    "job_artifact_registry"."artifact_id",
    "job_artifact_registry"."retry_count",
    "job_artifact_registry"."sha256",
    "job_artifact_registry"."missing_on_primary",
    "job_artifact_registry"."state",
    "job_artifact_registry"."last_synced_at",
    "job_artifact_registry"."last_sync_failure",
    "job_artifact_registry"."verified_at",
    "job_artifact_registry"."verification_started_at",
    "job_artifact_registry"."verification_retry_at",
    "job_artifact_registry"."verification_state",
    "job_artifact_registry"."verification_retry_count",
    "job_artifact_registry"."verification_checksum",
    "job_artifact_registry"."verification_checksum_mismatched",
    "job_artifact_registry"."checksum_mismatch",
    "job_artifact_registry"."verification_failure"
FROM
    "job_artifact_registry"
WHERE
    "job_artifact_registry"."artifact_id" IN (61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #364722 (closed) and #411770

Edited by Javiera Tapia

Merge request reports