Populate canonical emails
What does this MR do?
Continuation of !27722 (merged).
Background migration to generate a canonical email based on the user's primary email.
Canonical means the Agent part of the email address (the local part, before the `@`), omitting `.` and anything after any `+`.
Scoped to Gmail because it is a service that ignores `.` and anything after `+` in the Agent part, so all variations arrive in the same inbox.
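The rule described above can be sketched as follows. This is a minimal illustration in Python, not the code from this MR: the function name `canonical_email`, the `GMAIL_DOMAIN` constant, the lowercasing step, and returning `None` for non-Gmail addresses are all assumptions made for the example.

```python
from typing import Optional

# Assumption: only the exact gmail.com domain is in scope.
GMAIL_DOMAIN = "gmail.com"


def canonical_email(email: str) -> Optional[str]:
    """Return a canonical form for a Gmail address, or None otherwise.

    Hypothetical helper: drops everything after the first "+" in the
    part before the "@", then removes every "." from what is left.
    """
    # Lowercasing is an assumption of this sketch, not stated in the MR.
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or domain != GMAIL_DOMAIN:
        return None

    # Keep only the text before the first "+", then strip the dots.
    local = local.split("+", 1)[0].replace(".", "")
    if not local:
        return None

    return f"{local}@{domain}"


if __name__ == "__main__":
    for address in ("j.ohn.doe+spam@gmail.com", "johndoe@gmail.com", "user@example.com"):
        print(address, "->", canonical_email(address))
```

With this sketch, `j.ohn.doe+spam@gmail.com` and `johndoe@gmail.com` map to the same canonical value, while addresses on other domains are left without one.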
According to the query plan below, the planner estimates 2611059 `*@gmail.com` addresses on gitlab.com.
If the migration processes 1000 rows per minute and there are about 2.6 million Gmail addresses, that's roughly 2600 minutes, or 43 hours (just under 2 days).
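The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name `estimate_runtime` is hypothetical, and the rate of 1000 rows per minute is the assumption stated in this description, not a measured throughput.

```python
import math
from typing import Tuple


def estimate_runtime(total_rows: int, rows_per_minute: int) -> Tuple[int, float, float]:
    """Return (minutes, hours, days) needed to process total_rows."""
    # Round up: a partially filled final batch still takes a full minute.
    minutes = math.ceil(total_rows / rows_per_minute)
    hours = minutes / 60
    days = hours / 24
    return minutes, hours, days


if __name__ == "__main__":
    # 2611059 is the row estimate from the query plan; 1000 rows/minute is assumed.
    minutes, hours, days = estimate_runtime(2_611_059, 1_000)
    print(f"{minutes} minutes, about {hours:.1f} hours, about {days:.1f} days")
```

Using the exact row estimate instead of the rounded 2.6 million gives 2612 minutes, which is still a little over 43 hours and under 2 days.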
DB query plan for all users with `@gmail.com` domains
Query: `explain SELECT * FROM users WHERE email LIKE '%gmail.com%' ORDER BY users.id ASC LIMIT 1 OFFSET 100;`
Limit (cost=24.26..24.50 rows=1 width=1202) (actual time=5.004..5.005 rows=1 loops=1)
Buffers: shared hit=156 read=5
I/O Timings: read=1.695
-> Index Scan using users_pkey on users (cost=0.43..622171.36 rows=2611059 width=1202) (actual time=0.188..4.994 rows=101 loops=1)
Filter: ((email)::text ~~ '%gmail.com%'::text)
Rows Removed by Filter: 58
Buffers: shared hit=156 read=5
I/O Timings: read=1.695
Planning time: 12.575 ms
Execution time: 5.100 ms
Related to https://gitlab.com/gitlab-org/gitlab/issues/205383
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
- [-] Code review guidelines
- [-] Merge request performance guidelines
- Style guides
- Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention `@gitlab-com/gl-security/appsec`
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Closes #205383