Sync attributes between `users` and `user_details`
What does this MR do?
This MR sets up a new table where the rarely used (mostly visible on user's profile) string attributes on users
are going to be moved to user_details
.
This MR contains:
- Creation of
user_details
table. - Set up
INSERT
andUPDATE
trigger onusers
to sync the values, respecting the newly established validation (LIMIT
) rules onuser_details
. - Background migration to copy the column data to the
user_details
table.
Users to migrate: < 400_000 (specialized index is included)
UserDetail limits:
column | default | limit | reasoning |
---|---|---|---|
bio | '' |
255 | existing limit on User model |
organization | '' |
255 | it was unlimited, couldn't find meaningful organization in the PRD DB (only spam) above 255 |
location | '' |
255 | it was unlimited, couldn't find meaningful location in the PRD DB (only spam) above 255 |
'' |
2048 | it was unlimited, setting max browser URL length limit | |
'' |
2048 | it was unlimited, setting max browser URL length limit | |
skype | '' |
2048 | it was unlimited, setting max browser URL length limit |
website_url | '' |
2048 | it was unlimited, setting max browser URL length limit |
Migration rule: Only migrate records when one of the columns are are NOT NULL
OR ''
:
- bio
- location
- organization
- skype
- website_url
BG migration query:
INSERT INTO user_details
(
user_id,
bio,
location,
organization,
linkedin,
twitter,
skype,
website_url
)
SELECT id AS user_id,
substring(COALESCE(bio, '') from 1 for 255) AS bio,
substring(COALESCE(location, '') from 1 for 255) AS location,
substring(COALESCE(organization, '') from 1 for 255) AS orgainzation,
substring(COALESCE(linkedin, '') from 1 for 255) AS linkedin,
substring(COALESCE(twitter, '') from 1 for 255) AS twitter,
substring(COALESCE(skype, '') from 1 for 255) AS skype,
substring(COALESCE(website_url, '') from 1 for 255) AS website_url
FROM "users"
WHERE ((
COALESCE(bio, '') IS distinct FROM '')
OR (
COALESCE(location, '') IS DISTINCT FROM '')
OR (
COALESCE(organization, '') IS DISTINCT FROM '')
OR (
COALESCE(linkedin, '') IS DISTINCT FROM '')
OR (
COALESCE(twitter, '') IS DISTINCT FROM '')
OR (
COALESCE(skype, '') IS DISTINCT FROM '')
OR (
COALESCE(website_url, '') IS DISTINCT FROM ''))
AND "users"."id" BETWEEN 1 AND 40449
ON conflict (user_id) do UPDATE
set "bio" = excluded."bio",
"location" = excluded."location",
"organization" = excluded."organization",
"linkedin" = excluded."linkedin",
"twitter" = excluded."twitter",
"skype" = excluded."skype",
"website_url" = excluded."website_url";
- Plan is for batch size: 1_000
- Around 1s, since the data is not part of the index, we could reduce the batch size to 500.
- Estimated execution time (1_000 batch): ~ 15 hours
- Estimated execution time (500 batch): ~ 30 hours
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Closes #206913 (closed)
Edited by Chad Woolley