Patch batched migrations using locks on CTE
Summary
The following query CTE1:
WITH "batched_relation" AS MATERIALIZED
(
SELECT "epics".* FROM "epics"
WHERE "epics"."id" BETWEEN 21 AND 30
AND "epics"."id" >= 21 AND "epics"."id" < 23
)
SELECT "epics".* FROM "batched_relation"
AS "epics"
FOR UPDATE
This does not lock the epics
. While the test is paused inside the transaction I can open gdk psql -d gitlabhq_test
and I can still update the epics
.
At first I thought this was because we are using MATERIALIZED
because a materialized table is a whole different table. So there would be no way for that to actually lock the original table. But even removing MATERIALIZED
(which can be done by passing materialized: false
to CTE.new
) did not fix the problem.
Instead we needed a normal SELECT FROM epics
statement.
So changing the query to smth like this(CTE2) would accomplish the lock we are after:
WITH "batched_relation" AS MATERIALIZED
(
SELECT "epics".* FROM "epics"
WHERE "epics"."id" BETWEEN 21 AND 30
AND "epics"."id" >= 21 AND "epics"."id" < 23 FOR UPDATE
)
SELECT "epics".* FROM "batched_relation" AS "epics"
This locking approach is used in a couple batched migrations:
- https://gitlab.com/gitlab-org/gitlab/-/blob/483cf5abb22d7973c5d6e252cc4c4d289be95f1f/lib/gitlab/background_migration/backfill_epic_basic_fields_to_work_item_record.rb#L6-6
- https://gitlab.com/gitlab-org/gitlab/-/blob/483cf5abb22d7973c5d6e252cc4c4d289be95f1f/lib/gitlab/background_migration/backfill_work_item_hierarchy_for_epics.rb#L5-5
Steps to reproduce
Watch the following recording from @DylanGriffith
https://gitlab.com/-/project/278964/uploads/12a8674620903e6f3d97c6d3ef5f4876/2024-06-03_15-40-27.mkv or bellow steps:
- Open 2 postresql command lines
- In first command line open a transaction and run the above
CTE1
query - While the transaction is still running in first command line try to update the same queries in the second postresql command line. E.g. change epic titles
- Note that the update succeeds in the second command line.
Use the same steps with CTE2
to check that locking works as expected: