Skip to content

Migrate epic mentions to epic_user_mentions

What does this MR do?

This MR contains migrations for epic and epic notes mentions.

These migrations are to be run after !19088 (merged) is merged and confirmed to be working fine on production

This MR has the migrations, the common code under CreateResourceUserMention that would be used for further migration of other models mentions as well as isolated Mentionable concern and Epic and EpicUserMention models.

CreateResourceUserMention makes use of bulk insert feature to speed up inserts of new mentions.

I've pulled some estimate counts of the records we might be looking at from slack database-lab channel.

Database benchmarks

Number of rows affected

Epic with mentions in description or title epic(~696), and mentions in epic notes(~8,532):

explain select * from epics where description like '%\@%'
Seq Scan on public.epics  (cost=0.00..5731.54 rows=383 width=780) (actual time=3.214..724.875 rows=696 loops=1)
   Filter: (epics.description ~~ '%\@%'::text)
   Rows Removed by Filter: 12072
   Buffers: shared dirtied=52 hit=1063 read=1881
explain select * from notes where noteable_type = 'Epic' and note like '%\@%'
 Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes  (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=4.954..685897.784 rows=8532 loops=1)
   Index Cond: ((notes.noteable_type)::text = 'Epic'::text)
   Filter: (notes.note ~~ '%\@%'::text)
   Rows Removed by Filter: 116200
   Buffers: shared hit=774523 read=568275

Migrate epic mentions

Query plans for sql queries to migrate mentions for epic description and title based on database-lab data.

  • Query 1: Get the start ID for the epics to be migrated. Plan: https://explain.depesz.com/s/K6KF

    EXPLAIN SELECT  "epics"."id" FROM "epics" 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like '%@%' OR title like '%@%') AND epic_user_mentions.epic_id is null) 
    ORDER BY "epics"."id" ASC LIMIT 1
  • Query 2: Get max id for the 10K batch rows for epics to be migrated. Plan: https://explain.depesz.com/s/8IBp

    EXPLAIN SELECT  epics.id FROM epics 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) 
    AND epics.id >= 1 ORDER BY epics.id ASC LIMIT 1 OFFSET 10000
  • Query 3: Get epics to be migrated between min id and max id. Plan: https://explain.depesz.com/s/VSrf

    EXPLAIN SELECT MIN(epics.id), MAX(epics.id) FROM epics 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) 
    AND epics.id >= 1

Migrate epic notes mentions

Query plans for sql queries to migrate mentions for epic notes based on database-lab data.

  • On notes queries are very slow even when notes.id is between range of values. For instance bellow query only gets 66 matching records in a range of ~10M notes, and it takes ~15mins to execute: https://explain.depesz.com/s/zY31

    explain SELECT "notes"."id" FROM "notes" 
    LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id 
    WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL 
    AND notes.noteable_type = 'Epic' AND notes.system = false) 
    AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • Creating a temporary index helps quite a bit:

    exec CREATE INDEX epic_mentions_temp_index ON notes (id) 
    WHERE note ~~ %@%::text AND notes.noteable_type = Epic AND notes.system = false
    
    The query has been executed. Duration: 37.239 min

    \di+ epic_mentions_temp_index

    List of relations
    Schema |      Name       | Type  | Owner  | Table |  Size  | Description 
    --------+-----------------+-------+--------+-------+--------+-------------
    public | epic_mentions_temp_index | index | gitlab | notes | 216 kB | 
    (1 row)
  • Afterwards the above query took only ~0.4ms to execute: https://explain.depesz.com/s/qel3

    explain SELECT "notes"."id" FROM "notes" 
    LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id 
    WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL 
    AND notes.noteable_type = 'Epic' AND notes.system = false) 
    AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • We will need to drop the epic_mentions_temp_index index in a different release after background migrations are finished.

Migration Runtime estimates

Epic title and description mentions: ~2mins
  1. Epics with mentions in title or description count: ~800 https://explain.depesz.com/s/spYu
EXPLAIN SELECT  epics.id FROM epics
LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id
WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) ORDER BY epics.id

(actual time=0.057..1994.760 rows=792 loops=1)
  1. Given count of epics in 1, all the queries from !22333 (comment 265837848) execute in ~2 secs
  2. Given count of epics in 1 a single background job will be ran, which locally for 10K records runs in ~2.5 minutes, so for this case it would be under 2 mins.
Epic notes mentions: ~60 mins
  1. Concurrent Index creation: ~60 mins
  2. Epic notes with mentions count: ~9000 https://explain.depesz.com/s/B9m8
explain SELECT notes.id FROM notes
LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id
WHERE (note LIKE %@% AND epic_user_mentions.epic_id IS NULL
AND notes.noteable_type = Epic AND notes.system = false) AND notes.id >= 1 ORDER BY notes.id

(actual time=15.051..16.518 rows=9276 loops=1)
  1. Given count of epic notes in 2, all the queries from !22333 (comment 265838096) would execute in 1-2 seconds.
  2. Given count of epic notes in 2 a single background job should be need that locally takes ~2.5 mins

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #21801

Edited by Alper Akgun

Merge request reports