Migrate mentions for epics and notes.noteable_type = 'Epic'
Number of rows affected
Epic with mentions in description or title epic(~696), and mentions in epic notes(~8,532):
explain select * from epics where description like '%\@%'
Seq Scan on public.epics (cost=0.00..5731.54 rows=383 width=780) (actual time=3.214..724.875 rows=696 loops=1)
Filter: (epics.description ~~ '%\@%'::text)
Rows Removed by Filter: 12072
Buffers: shared dirtied=52 hit=1063 read=1881
explain select * from notes where noteable_type = 'Epic' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=4.954..685897.784 rows=8532 loops=1)
Index Cond: ((notes.noteable_type)::text = 'Epic'::text)
Filter: (notes.note ~~ '%\@%'::text)
Rows Removed by Filter: 116200
Buffers: shared hit=774523 read=568275
Migrate epic mentions
Query plans for sql queries to migrate mentions for epic description and title based on database-lab
data.
-
Query 1: Get the start ID for the epics to be migrated. Plan: https://explain.depesz.com/s/K6KF
EXPLAIN SELECT "epics"."id" FROM "epics" LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id WHERE ((description like '%@%' OR title like '%@%') AND epic_user_mentions.epic_id is null) ORDER BY "epics"."id" ASC LIMIT 1
-
Query 2: Get max id for the 10K batch rows for epics to be migrated. Plan: https://explain.depesz.com/s/8IBp
EXPLAIN SELECT “epics”.“id” FROM “epics” LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id WHERE ((description like ‘%@%’ OR title like ‘%@%’) AND epic_user_mentions.epic_id is null) AND “epics”.“id” >= 1 ORDER BY “epics”.“id” ASC LIMIT 1 OFFSET 10000
-
Query 3: Get epics to be migrated between min id and max id. Plan: https://explain.depesz.com/s/VSrf
EXPLAIN SELECT MIN(epics.id), MAX(epics.id) FROM “epics” LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id WHERE ((description like ‘%@%’ OR title like ‘%@%’) AND epic_user_mentions.epic_id is null) AND “epics”.“id” >= 1
Migrate epic notes mentions
Query plans for sql queries to migrate mentions for epic notes based on database-lab
data.
-
On notes queries are very slow even when notes.id is between range of values. For instance bellow query only gets 66 matching records in a range of ~10M notes, and it takes ~15mins to execute: https://explain.depesz.com/s/zY31
explain SELECT "notes"."id" FROM "notes" LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL AND notes.noteable_type = 'Epic' AND notes.system = false) AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
-
Creating a temporary index helps quite a bit:
exec CREATE INDEX epic_mentions_temp_index ON notes (id) WHERE note ~~ ‘%@%’::text AND notes.noteable_type = ‘Epic’ AND notes.system = false The query has been executed. Duration: 37.239 min
\di+ epic_mentions_temp_index
List of relations Schema | Name | Type | Owner | Table | Size | Description --------+-----------------+-------+--------+-------+--------+------------- public | epic_mentions_temp_index | index | gitlab | notes | 216 kB | (1 row)
-
Afterwards the above query took only ~0.4ms to execute: https://explain.depesz.com/s/qel3
explain SELECT "notes"."id" FROM "notes" LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id WHERE (note LIKE '%@%' AND epic_user_mentions.epic_id IS NULL AND notes.noteable_type = 'Epic' AND notes.system = false) AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
-
We will need to drop the
epic_mentions_temp_index
index in a different release after background migrations are finished.
Migration Runtime estimates
~2mins
Epic title and description mentions:- Epics with mentions in title or description count: ~800 https://explain.depesz.com/s/spYu
EXPLAIN SELECT “epics”.“id” FROM “epics”
LEFT JOIN epic_user_mentions on epics.id = epic_user_mentions.epic_id
WHERE ((description like ‘%@%’ OR title like ‘%@%’) AND epic_user_mentions.epic_id is null) ORDER BY “epics”.“id”
(actual time=0.057..1994.760 rows=792 loops=1)
- Given count of epics in 1, all the queries from !22333 (comment 265837848) execute in ~2 secs
- Given count of epics in 1 a single background job will be ran, which locally for 10K records runs in ~2.5 minutes, so for this case it would be under 2 mins.
~60 mins
Epic notes mentions:- Concurrent Index creation: ~60 mins
- Epic notes with mentions count: ~9000 https://explain.depesz.com/s/B9m8
explain SELECT “notes”.“id” FROM “notes”
LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id
WHERE (note LIKE ‘%@%’ AND epic_user_mentions.epic_id IS NULL
AND notes.noteable_type = ‘Epic’ AND notes.system = false) AND notes.id >= 1 ORDER BY notes.id
(actual time=15.051..16.518 rows=9276 loops=1)
- Given count of epic notes in 2, all the queries from !22333 (comment 265838096) would execute in 1-2 seconds.
- Given count of epic notes in 2 a single background job should be need that locally takes ~2.5 mins