Skip to content

Geo - Support filtering pipeline artifacts by keyword via GraphQL

What does this MR do and why?

Currently, for self-service replicables, we are unable to do any filtering on the Geo Replicable views.

This MR uses better composition to filter logic applied to current Geo replicable registries (job artifacts) and also supports filtering pipeline artifacts registries by keyword (fuzzy_search) via GraphQL:

Registry Searchable attributes
Pipeline Artifacts project :name, :description and :path

Database

Raw SQL and query plans

Using:

Ci::PipelineArtifact.search('GitLab')

Then:

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14233/commands/49950

SELECT
    "projects"."id"
FROM
    "projects"
WHERE (("projects"."path" ILIKE '%GitLab%'
        OR "projects"."name" ILIKE '%GitLab%')
    OR "projects"."description" ILIKE '%GitLab%')
LIMIT 1000

Query plan:

 Limit  (cost=0.00..2087.49 rows=1000 width=4) (actual time=3.466..144.779 rows=1000 loops=1)
   Buffers: shared read=1167 dirtied=35 written=35
   I/O Timings: read=77.631 write=0.753
   ->  Seq Scan on public.projects  (cost=0.00..5214758.43 rows=2498096 width=4) (actual time=3.465..144.623 rows=1000 loops=1)
         Filter: (((projects.path)::text ~~* '%GitLab%'::text) OR ((projects.name)::text ~~* '%GitLab%'::text) OR (projects.description ~~* '%GitLab%'::text))
         Rows Removed by Filter: 28836
         Buffers: shared read=1167 dirtied=35 written=35
         I/O Timings: read=77.631 write=0.753

Then:

https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/14566/commands/51084

SELECT
    "ci_pipeline_artifacts".*
FROM
    "ci_pipeline_artifacts"
WHERE
    "ci_pipeline_artifacts"."project_id" IN (2, 1, 19)

Query plan:

Index Scan using index_ci_pipeline_artifacts_on_project_id on public.ci_pipeline_artifacts  (cost=0.43..130.84 rows=83 width=171) (actual time=2.069..2.070 rows=0 loops=1)  
   Index Cond: (ci_pipeline_artifacts.project_id = ANY ('{2,1,19}'::bigint[]))  
   Buffers: shared hit=9 read=3  
   I/O Timings: read=2.017 write=0.000

Statistics:

Time: 3.683 ms
  - planning: 1.572 ms
  - execution: 2.111 ms
    - I/O read: 2.017 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 9 (~72.00 KiB) from the buffer pool
  - reads: 3 (~24.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #364722 (closed) and #411770

Edited by Javiera Tapia

Merge request reports