Skip to content

Migrate mentions from design notes to DB table

What does this MR do?

This MR contains migrations for design notes mentions.

These migrations are to be run after !19088 (merged) is merged and confirmed to be working fine on production

This MR is based on initial migration that contains common code under CreateResourceUserMention to be used on all mentions migrations of other models, including DesignManagement::Design.

I've pulled some estimate counts of the records we might be looking at from slack database-lab channel.

Database benchmarks

Number of rows affected

Design notes with mentions ~700

explain select id from notes where noteable_type = DesignManagement::Design and note like %\@%

Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes  (cost=0.57..4519075.97 rows=3936 width=4) (actual time=7.361..952352.359 rows=652 loops=1)
   Index Cond: ((notes.noteable_type)::text = 'DesignManagement::Design'::text)
   Filter: (notes.note ~~ '%\@%'::text)
   Rows Removed by Filter: 1535
   Buffers: shared hit=107 read=1362816 dirtied=17
   I/O Timings: read=938952.612

Migrate design notes mentions

Query plans for sql queries to migrate mentions for design notes based on database-lab data.

  • On notes queries are very slow even when notes.id is between range of values. For instance bellow query only gets 66 matching records in a range of ~10M notes, and it takes ~18mins to execute: https://explain.depesz.com/s/oo7c

    explain SELECT notes.id FROM notes
    LEFT JOIN design_user_mentions ON notes.id = design_user_mentions.note_id
    WHERE (note LIKE %@% AND design_user_mentions.design_id IS NULL
    AND notes.noteable_type = DesignManagement::Design)
    AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • Creating a temporary index helps quite a bit:

    exec CREATE INDEX CONCURRENTLY design_mentions_temp_index ON notes (id) 
    WHERE note ~~ %@%::text AND notes.noteable_type = DesignManagement::Design
    
    The query has been executed. Duration: 63.628 min

    \di+ design_mentions_temp_index

    List of relations
    Schema |            Name            | Type  | Owner  | Table | Size  | Description 
    --------+----------------------------+-------+--------+-------+-------+-------------
    public | design_mentions_temp_index | index | gitlab | notes | 40 kB | 
    (1 row)
  • Afterwards the above query took only ~0.4ms to execute

      explain SELECT notes.id FROM notes
      LEFT JOIN design_user_mentions ON notes.id = design_user_mentions.note_id
      WHERE (note LIKE %@% AND design_user_mentions.design_id IS NULL
      AND notes.noteable_type = DesignManagement::Design)
      AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • We will need to drop the design_mentions_temp_index index in a different release after background migrations are finished.

Migration Runtime estimates:

Design notes mentions migration: ~60 mins
  1. Concurrent Index creation: ~60 mins
  2. Design notes with mentions count: ~700 => ~2-3 mins for the job to run
explain SELECT notes.id FROM notes
LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id
WHERE (note LIKE %@% AND epic_user_mentions.epic_id IS NULL
AND notes.noteable_type = Epic AND notes.system = false) AND notes.id >= 1 ORDER BY notes.id

Hash Left Join  (cost=4.95..14.83 rows=1 width=4) (actual time=0.093..1.283 rows=699 loops=1)
 Hash Cond: (notes.id = design_user_mentions.note_id)
 Filter: (design_user_mentions.design_id IS NULL)
 Rows Removed by Filter: 35
 Buffers: shared hit=355
 ->  Index Only Scan using design_mentions_temp_index on public.notes  (cost=0.28..9.56 rows=157 width=4) (actual time=0.048..1.098 rows=734 loops=1)
       Index Cond: (notes.id >= 1)
       Heap Fetches: 20
       Buffers: shared hit=351
 ->  Hash  (cost=4.30..4.30 rows=30 width=8) (actual time=0.019..0.019 rows=35 loops=1)
       Buckets: 1024  Batches: 1  Memory Usage: 10kB
       Buffers: shared hit=1
       ->  Seq Scan on public.design_user_mentions  (cost=0.00..4.30 rows=30 width=8) (actual time=0.006..0.009 rows=35 loops=1)
             Buffers: shared hit=1

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #198325 (closed)

Edited by Coung Ngo

Merge request reports