Migrate mentions for epics and notes.noteable_type = 'Epic'

Number of rows affected

Epic with mentions in description or title epic(~696), and mentions in epic notes(~8,532):

explain select * from epics where description like '%\@%'
Seq Scan on public.epics  (cost=0.00..5731.54 rows=383 width=780) (actual time=3.214..724.875 rows=696 loops=1)
   Filter: (epics.description ~~ '%\@%'::text)
   Rows Removed by Filter: 12072
   Buffers: shared dirtied=52 hit=1063 read=1881
explain select * from notes where noteable_type = 'Epic' and note like '%\@%'
 Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes  (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=4.954..685897.784 rows=8532 loops=1)
   Index Cond: ((notes.noteable_type)::text = 'Epic'::text)
   Filter: (notes.note ~~ '%\@%'::text)
   Rows Removed by Filter: 116200
   Buffers: shared hit=774523 read=568275

Migrate epic mentions

Query plans for sql queries to migrate mentions for epic description and title based on database-lab data.

  • Query 1: Get the start ID for the epics to be migrated. Plan: https://explain.depesz.com/s/K6KF

    EXPLAIN SELECT  "epics"."id" FROM "epics" 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like '%@%' OR title like '%@%') AND epic_user_mentions.epic_id is null) 
    ORDER BY "epics"."id" ASC LIMIT 1
  • Query 2: Get max id for the 10K batch rows for epics to be migrated. Plan: https://explain.depesz.com/s/8IBp

    EXPLAIN SELECT  epics.id FROM epics 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) 
    AND epics.id >= 1 ORDER BY epics.id ASC LIMIT 1 OFFSET 10000
  • Query 3: Get epics to be migrated between min id and max id. Plan: https://explain.depesz.com/s/VSrf

    EXPLAIN SELECT MIN(epics.id), MAX(epics.id) FROM epics 
    LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id 
    WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) 
    AND epics.id >= 1

Migrate epic notes mentions

Query plans for sql queries to migrate mentions for epic notes based on database-lab data.

  • On notes queries are very slow even when notes.id is between range of values. For instance bellow query only gets 66 matching records in a range of ~10M notes, and it takes ~15mins to execute: https://explain.depesz.com/s/zY31

    explain SELECT "notes"."id" FROM "notes" 
    LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id 
    WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL 
    AND notes.noteable_type = 'Epic' AND notes.system = false) 
    AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • Creating a temporary index helps quite a bit:

    exec CREATE INDEX epic_mentions_temp_index ON notes (id) 
    WHERE note ~~ %@%::text AND notes.noteable_type = Epic AND notes.system = false
    
    The query has been executed. Duration: 37.239 min

    \di+ epic_mentions_temp_index

    List of relations
    Schema |      Name       | Type  | Owner  | Table |  Size  | Description 
    --------+-----------------+-------+--------+-------+--------+-------------
    public | epic_mentions_temp_index | index | gitlab | notes | 216 kB | 
    (1 row)
  • Afterwards the above query took only ~0.4ms to execute: https://explain.depesz.com/s/qel3

    explain SELECT "notes"."id" FROM "notes" 
    LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id 
    WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL 
    AND notes.noteable_type = 'Epic' AND notes.system = false) 
    AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
  • We will need to drop the epic_mentions_temp_index index in a different release after background migrations are finished.

Migration Runtime estimates

Epic title and description mentions: ~2mins
  1. Epics with mentions in title or description count: ~800 https://explain.depesz.com/s/spYu
EXPLAIN SELECT  epics.id FROM epics
LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id
WHERE ((description like %@% OR title like %@%) AND epic_user_mentions.epic_id is null) ORDER BY epics.id

(actual time=0.057..1994.760 rows=792 loops=1)
  1. Given count of epics in 1, all the queries from !22333 (comment 265837848) execute in ~2 secs
  2. Given count of epics in 1 a single background job will be ran, which locally for 10K records runs in ~2.5 minutes, so for this case it would be under 2 mins.
Epic notes mentions: ~60 mins
  1. Concurrent Index creation: ~60 mins
  2. Epic notes with mentions count: ~9000 https://explain.depesz.com/s/B9m8
explain SELECT notes.id FROM notes
LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id
WHERE (note LIKE %@% AND epic_user_mentions.epic_id IS NULL
AND notes.noteable_type = Epic AND notes.system = false) AND notes.id >= 1 ORDER BY notes.id

(actual time=15.051..16.518 rows=9276 loops=1)
  1. Given count of epic notes in 2, all the queries from !22333 (comment 265838096) would execute in 1-2 seconds.
  2. Given count of epic notes in 2 a single background job should be need that locally takes ~2.5 mins