Backfill sharding key on resource_label_events
What does this MR do and why?
- Backfill
resource_label_eventstable with correct sharding key based on the issuable (issue, merge_request, epic) - Fix invalid records (records that have both
epic_idandissue_idset) during backfill (source of the error fixed in !196618 (merged)) -
Add syncAdding in Add resource_label_events.namespace_id sync index (!197983 - merged) so we can start the review on this one before the weekend. We don't need the index for the backfill as we are not introducing the FK yet.namespace_idindex -
Add NOT VALID FK on namespace_id columnNot adding Fk for safety on the move feature as copying existing records could potentially cause errors if existing records are not backfilled first. Adding and validating FK constraint after backfill is finalized in %18.4
DB review
Clean epic_id records that also had issue_id set
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/41544/commands/127575
WITH relation AS (
SELECT
"resource_label_events".*
FROM
"resource_label_events"
WHERE
"resource_label_events"."id" BETWEEN 419550600
AND 419559999
AND "resource_label_events"."id" >= 419550600
AND "resource_label_events"."id" < 419550800
LIMIT
200
), filtered_relation AS (
SELECT
*
FROM
relation
WHERE
(
num_nonnulls(epic_id, issue_id) = 2
)
LIMIT
200
)
UPDATE
"resource_label_events"
SET
"issue_id" = NULL
FROM
"filtered_relation"
WHERE
"resource_label_events"."id" = "filtered_relation"."id"
Set namespace_id for issue related records
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/41544/commands/127578
WITH relation AS (
SELECT
"resource_label_events".*
FROM
"resource_label_events"
WHERE
"resource_label_events"."id" BETWEEN 419550600
AND 419559999
AND "resource_label_events"."id" >= 419550600
AND "resource_label_events"."id" < 419550800
LIMIT
200
), filtered_relation AS (
SELECT
*
FROM
relation
WHERE
issue_id IS NOT NULL
LIMIT
200
)
UPDATE
"resource_label_events"
SET
"namespace_id" = "issues"."namespace_id"
FROM
"filtered_relation"
INNER JOIN "issues" ON "filtered_relation"."issue_id" = "issues"."id"
WHERE
"resource_label_events"."id" = "filtered_relation"."id"
Set namespace_id for merge request related records
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/41544/commands/127579
WITH relation AS (
SELECT
"resource_label_events".*
FROM
"resource_label_events"
WHERE
"resource_label_events"."id" BETWEEN 419550600
AND 419559999
AND "resource_label_events"."id" >= 419550600
AND "resource_label_events"."id" < 419550800
LIMIT
200
), filtered_relation AS (
SELECT
*
FROM
relation
WHERE
merge_request_id IS NOT NULL
LIMIT
200
)
UPDATE
"resource_label_events"
SET
"namespace_id" = "projects"."project_namespace_id"
FROM
filtered_relation
INNER JOIN merge_requests ON filtered_relation.merge_request_id = merge_requests.id
INNER JOIN projects ON projects.id = merge_requests.target_project_id
WHERE
"resource_label_events"."id" = "filtered_relation"."id"
Set namespace_id for epic related records
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/41544/commands/127580
WITH relation AS (
SELECT
"resource_label_events".*
FROM
"resource_label_events"
WHERE
"resource_label_events"."id" BETWEEN 419550600
AND 419559999
AND "resource_label_events"."id" >= 419550600
AND "resource_label_events"."id" < 419550800
LIMIT
200
), filtered_relation AS (
SELECT
*
FROM
relation
WHERE
epic_id IS NOT NULL
LIMIT
200
)
UPDATE
"resource_label_events"
SET
"namespace_id" = "epics"."group_id"
FROM
filtered_relation
INNER JOIN epics ON filtered_relation.epic_id = epics.id
WHERE
"resource_label_events"."id" = "filtered_relation"."id"
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #514594 (closed)
Edited by Mario Celi