Skip to content

Geo - Support filtering merge request diffs by keyword via GraphQL

What does this MR do and why?

Currently, for self-service replicables, we are unable to do any filtering on the Geo Replicable views.

This MR supports filtering merge request diffs registries by keyword via GraphQL:

Registry Searchable attributes
Merge Request Diffs External diff

Database

Raw SQL and query plans

Using:

MergeRequestDiff.search('diff-107')

Then:

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15211/commands/52832

SELECT
    merge_request_diffs.*
FROM
    merge_request_diffs
WHERE
    merge_request_diffs.external_diff = 'diff-107'
LIMIT 1000;

Query plan (without index):

 Limit  (cost=1000.00..50833046.37 rows=2 width=186) (actual time=682932.737..682954.507 rows=0 loops=1)
   Buffers: shared hit=1274759 read=10905828 dirtied=85475 written=84967
   I/O Timings: read=1102452.295 write=2872.340
   ->  Gather  (cost=1000.00..50833046.37 rows=2 width=186) (actual time=682932.709..682954.477 rows=0 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=1274759 read=10905828 dirtied=85475 written=84967
         I/O Timings: read=1102452.295 write=2872.340
         ->  Parallel Seq Scan on public.merge_request_diffs  (cost=0.00..50832046.17 rows=1 width=186) (actual time=682925.760..682925.761 rows=0 loops=3)
               Filter: ((merge_request_diffs.external_diff)::text = 'diff-107'::text)
               Rows Removed by Filter: 135051599
               Buffers: shared hit=1274759 read=10905828 dirtied=85475 written=84967
               I/O Timings: read=1102452.295 write=2872.340

Statistics (without index):

Time: 11.383 min  
  - planning: 2.359 ms  
  - execution: 11.383 min  
    - I/O read: 18.374 min  
    - I/O write: 2.872 s  
  
Shared buffers:  
  - hits: 1274759 (~9.70 GiB) from the buffer pool  
  - reads: 10905828 (~83.20 GiB) from the OS file cache, including disk I/O  
  - dirtied: 85475 (~667.80 MiB)  
  - writes: 84967 (~663.80 MiB) 

Then:

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15211/commands/52838

Query plan (with index):

 Limit  (cost=0.57..4.94 rows=2 width=186) (actual time=0.308..0.309 rows=0 loops=1)
   Buffers: shared read=4
   I/O Timings: read=0.262 write=0.000
   ->  Index Scan using index_merge_request_diffs_on_external_diff on public.merge_request_diffs  (cost=0.57..4.94 rows=2 width=186) (actual time=0.306..0.306 rows=0 loops=1)
         Index Cond: ((merge_request_diffs.external_diff)::text = 'diff-107'::text)
         Buffers: shared read=4
         I/O Timings: read=0.262 write=0.000

Statistics (with index):

Time: 2.528 ms  
  - planning: 2.181 ms  
  - execution: 0.347 ms  
    - I/O read: 0.262 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 0 from the buffer pool  
  - reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #364722 (closed) and #411770

Edited by Javiera Tapia

Merge request reports