Migrate mentions to respective DB tables
What does this MR do?
This MR contains a collection of migrations for mentions on various item: issues, epics, merge requests, notes.
This MR is one(most probably first) of a series of mentions migrations.
These migrations are to be run after !19088 (merged) is merged and confirmed to be working fine on production
This MR is mainly the implementation of the common code lib under CreateResourceUserMention
that would be used for further migration of various chunks of mentions.
We plan on migrating mentions in chunks for the simple reason that the number of mentions that need to be migrated is very large, roughly 22M records in various tables.
CreateResourceUserMention
makes use of bulk insert feature to speed up inserts of new mentions.
I've pulled some count from slack database-lab
channel.
- All issues vs open issues that may have a mention in description: ~1,155,214 vs ~341,481
-- issues that possibly have a mention in description
explain select * from issues where description like '%\@%'
Seq Scan on issues (cost=0.00..7324876.67 rows=1155214 width=775)
Filter: (description ~~ '%\@%'::text)
-- Open issues that possibly have a mention in description
explain select * from issues where state_id = 1 and description like '%\@%'
Index Scan using idx_issues_on_state_id on issues (cost=0.44..3330471.90 rows=341481 width=775)
Index Cond: (state_id = 1)
Filter: (description ~~ '%\@%'::text)
- All issue notes vs open issues notes that may have a mention: ~7,592,623 vs ~2,150,714
-- issue notes that have at least one mention.
explain select * from notes where noteable_type = 'Issue' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on notes (cost=0.57..38323797.08 rows=7592623 width=2455)
Index Cond: ((noteable_type)::text = 'Issue'::text)
Filter: (note ~~ '%\@%'::text)
explain select * from notes, issues where issues.id = notes.noteable_id AND issues.state_id = 1 AND noteable_type = 'Issue' and note like '%\@%'
Merge Join (cost=86.98..33798499.89 rows=2150714 width=3230)
Merge Cond: (notes.noteable_id = issues.id)
-> Index Scan using index_notes_on_noteable_id_and_noteable_type on notes (cost=0.57..38323797.08 rows=7592623 width=2455)
Index Cond: ((noteable_type)::text = 'Issue'::text)
Filter: (note ~~ '%\@%'::text)
- All merge requests vs open merge requests that may have a mention in description: ~816,573 vs ~40,099
explain select * from merge_requests where description like '%\@%'
Seq Scan on merge_requests (cost=0.00..8260111.89 rows=816573 width=713)
Filter: (description ~~ '%\@%'::text)
explain select * from merge_requests where state_id = 1 and description like '%\@%'
Index Scan using idx_merge_requests_on_target_project_id_and_iid_opened on merge_requests (cost=0.43..1920706.01 rows=40099 width=713)
Filter: (description ~~ '%\@%'::text)
- All merge request notes vs open merge request notes that may have a mention: ~9,663,665 vs ~474,550
explain select * from notes, merge_requests where merge_requests.id = notes.noteable_id AND noteable_type = 'MergeRequest' and note like '%\@%'
Merge Join (cost=34.51..48430159.16 rows=9663665 width=3168)
Merge Cond: (notes.noteable_id = merge_requests.id)
-> Index Scan using index_notes_on_noteable_id_and_noteable_type on notes (cost=0.57..42041324.54 rows=10084526 width=2455)
Index Cond: ((noteable_type)::text = 'MergeRequest'::text)
Filter: (note ~~ '%\@%'::text)
explain select * from notes, merge_requests where merge_requests.id = notes.noteable_id AND state_id =1 AND noteable_type = 'MergeRequest' and note like '%\@%'
Nested Loop (cost=1.00..30561364.85 rows=474550 width=3168)
-> Index Scan using idx_merge_requests_on_target_project_id_and_iid_opened on merge_requests (cost=0.43..1916559.80 rows=1658485 width=713)
-> Index Scan using index_notes_on_noteable_id_and_noteable_type on notes (cost=0.57..17.25 rows=2 width=2455)
Index Cond: ((noteable_id = merge_requests.id) AND ...
- Not much that we can do on commit notes though: ~770,166
explain select * from notes where noteable_type = 'Commit' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on notes (cost=0.57..14299350.09 rows=770166 width=2455)
Index Cond: ((noteable_type)::text = 'Commit'::text)
Filter: (note ~~ '%\@%'::text)
- Not too worried about epic(~696), epic notes(~8,532), snippet(~4,366), snippet notes(~2,257), design notes(~406) mentions:
explain select * from epics where description like '%\@%'
Seq Scan on public.epics (cost=0.00..5731.54 rows=383 width=780) (actual time=3.214..724.875 rows=696 loops=1)
Filter: (epics.description ~~ '%\@%'::text)
Rows Removed by Filter: 12072
Buffers: shared dirtied=52 hit=1063 read=1881
explain select * from notes where noteable_type = 'Epic' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=4.954..685897.784 rows=8532 loops=1)
Index Cond: ((notes.noteable_type)::text = 'Epic'::text)
Filter: (notes.note ~~ '%\@%'::text)
Rows Removed by Filter: 116200
Buffers: shared hit=774523 read=568275
explain select * from snippets where description like '%\@%'
Seq Scan on public.snippets (cost=0.00..124116.62 rows=2044 width=1080) (actual time=23.523..18530.015 rows=4366 loops=1)
Filter: (snippets.description ~~ '%\@%'::text)
Rows Removed by Filter: 216908
Buffers: shared dirtied=1176 hit=34223 read=45467
explain select * from notes where noteable_type = 'Snippet' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=2.047..617768.467 rows=2257 loops=1)
Index Cond: ((notes.noteable_type)::text = 'Snippet'::text)
Filter: (notes.note ~~ '%\@%'::text)
Rows Removed by Filter: 20493
Buffers: shared hit=766129 read=481730
explain select * from notes where noteable_type = 'DesignManagement::Design' and note like '%\@%'
Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes (cost=0.57..4105657.57 rows=4266 width=2455) (actual time=2.020..544961.579 rows=406 loops=1)
Index Cond: ((notes.noteable_type)::text = 'DesignManagement::Design'::text)
Filter: (notes.note ~~ '%\@%'::text)
Rows Removed by Filter: 1125
Buffers: shared hit=526138 read=698230 written=3
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Closes #21801