Skip to content

Add active_migrations_distinct_on_table scope

What does this MR do and why?

This MR is another step towards Execute batched background migrations in parallel (#372316 - closed).

It adds new scope active_migrations_distinct_on_table to fetch most recent active batched migrations distinct on table, so that we can execute them in parallel, but avoid having migrations targeting the same table.

Also update the existing active_migration to call the new scope with limit: 1, but keep its tests to ensure there is no change in behavior. Once this is deployed, batched migrations execution should continue as usual, job transitions can be monitored here - https://log.gprd.gitlab.net/goto/4e56e250-fe6d-11ec-8656-f5f2137823ba.

Once we are ready to start executing migrations in parallel we will switch to using only active_migrations_distinct_on_table, and control the limit value with feature flag or application setting.

Database review

Here's the execution plan for the new query introduced - https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13136/commands/46039. It's not super optimal, but given it's not much different than what we currently do, and the very small size of the batched_background_migrations table, I think it's good enough.

SQL

SELECT
	"batched_background_migrations".*
FROM
	"batched_background_migrations"
WHERE
	"batched_background_migrations"."id" IN (
		SELECT
			DISTINCT ON (table_name) id
		FROM
			"batched_background_migrations"
		WHERE
			"batched_background_migrations"."gitlab_schema" IN (
				'gitlab_main',
				'gitlab_shared',
				'gitlab_internal'
			)
			AND ("batched_background_migrations"."status" IN (1))
			AND (
				on_hold_until IS NULL
				OR on_hold_until < NOW()
			)
		ORDER BY
			"batched_background_migrations"."table_name" ASC,
			"batched_background_migrations"."id" ASC
	)
ORDER BY
	"batched_background_migrations"."id" ASC
LIMIT
	1

Execution plan

 Limit  (cost=4.31..8.72 rows=1 width=198) (actual time=4.349..4.352 rows=1 loops=1)
   Buffers: shared hit=34 read=6 dirtied=2
   I/O Timings: read=3.951 write=0.000
   ->  Merge Semi Join  (cost=4.31..17.53 rows=3 width=198) (actual time=4.348..4.351 rows=1 loops=1)
         Merge Cond: (batched_background_migrations.id = "ANY_subquery".id)
         Buffers: shared hit=34 read=6 dirtied=2
         I/O Timings: read=3.951 write=0.000
         ->  Index Scan using batched_background_migrations_pkey on public.batched_background_migrations  (cost=0.14..13.15 rows=68 width=198) (actual time=1.577..3.054 rows=66 loops=1)
               Buffers: shared hit=26 read=4 dirtied=1
               I/O Timings: read=2.797 write=0.000
         ->  Sort  (cost=4.17..4.18 rows=3 width=8) (actual time=1.280..1.281 rows=1 loops=1)
               Sort Key: "ANY_subquery".id
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=8 read=2 dirtied=1
               I/O Timings: read=1.154 write=0.000
               ->  Subquery Scan on ANY_subquery  (cost=4.10..4.15 rows=3 width=8) (actual time=1.259..1.264 rows=3 loops=1)
                     Buffers: shared hit=8 read=2 dirtied=1
                     I/O Timings: read=1.154 write=0.000
                     ->  Unique  (cost=4.10..4.12 rows=3 width=21) (actual time=1.257..1.261 rows=3 loops=1)
                           Buffers: shared hit=8 read=2 dirtied=1
                           I/O Timings: read=1.154 write=0.000
                           ->  Sort  (cost=4.10..4.11 rows=3 width=21) (actual time=1.256..1.257 rows=3 loops=1)
                                 Sort Key: batched_background_migrations_1.table_name, batched_background_migrations_1.id
                                 Sort Method: quicksort  Memory: 25kB
                                 Buffers: shared hit=8 read=2 dirtied=1
                                 I/O Timings: read=1.154 write=0.000
                                 ->  Index Scan using index_batched_background_migrations_on_status on public.batched_background_migrations batched_background_migrations_1  (cost=0.14..4.08 rows=3 width=21) (actual time=0.633..1.210 rows=3 loops=1)
                                       Index Cond: (batched_background_migrations_1.status = 1)
                                       Filter: ((batched_background_migrations_1.gitlab_schema = ANY ('{gitlab_main,gitlab_shared,gitlab_internal}'::text[])) AND ((batched_background_migrations_1.on_hold_until IS NULL) OR (batched_background_migrations_1.on_hold_until < now())))
                                       Rows Removed by Filter: 0
                                       Buffers: shared hit=5 read=2 dirtied=1
                                       I/O Timings: read=1.154 write=0.000

Statistics

Time: 5.565 ms
  - planning: 1.089 ms
  - execution: 4.476 ms
    - I/O read: 3.951 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 34 (~272.00 KiB) from the buffer pool
  - reads: 6 (~48.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 2 (~16.00 KiB)
  - writes: 0

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #372316 (closed)

Edited by Krasimir Angelov

Merge request reports