Skip to content

Save iteration sequence number

euko requested to merge 343621-separately-store-the-numbering-of-iterations into master

What does this MR do and why?

Part 1/2 of #343621 (closed) (Part 2: !75527 (closed))

  • DB change: Add a new column sequence to sprints table

  • DB change: Add a deferrable unique constraint to sprints table on iterations_cadence_id and sequence (the enforcement of the constraint is deferred until the end of a transaction to allow for a bulk update. See !74352 (comment 749969086))

  • DB update: Backfills sequence column

  • App model change: on creation, update or deletion of an iteration, the sequence numbers are checked/updated for all the iterations in the cadence containing the created/updated/deleted iteration.

Iteration cadence is being developed behind a FF :iteration_cadences

Background

An iteration cadence is a container for a sequence of iterations (i.e., an iteration cadence has many iterations and a group may have many iteration cadences.)

An iteration has a duration (period) and the iterations within a cadence are strictly sequential without overlapping periods.

While either the start or due date of an iteration can be used to derive the iteration's sequence number in a cadence, we've decided to explicitly store the sequence number in the sprints table for easier and performant referencing (useful for searching and generating default titles).

Illustration:

Cadence Iteration title Start Date End Date Sequence Number
Cadence A Iteration 1 Nov 1 Nov 5 1
" Foobar Nov 6 Nov 10 2
" Iteration 3 Nov 11 Nov 15 3
Cadence B Iteration 1 Nov 3 Nov 10 1

Migration

Up

== 20211126042235 AddSequenceColumnToSprintsTable: migrating ==================
-- add_column(:sprints, :sequence, :integer)
   -> 0.0015s
-- execute("ALTER TABLE sprints ADD CONSTRAINT sequence_is_unique_per_iterations_cadence_id UNIQUE (iterations_cadence_id, sequence) DEFERRABLE INITIALLY DEFERRED")
   -> 0.0023s
== 20211126042235 AddSequenceColumnToSprintsTable: migrated (0.0040s) =========


== 20211130165043 BackfillSequenceColumnForSprintsTable: migrating ============
-- execute("        UPDATE sprints\n        SET sequence=t.row_number\n        FROM (\n          SELECT id, row_number() OVER (PARTITION BY iterations_cadence_id ORDER BY start_date)\n          FROM sprints as s1\n          WHERE s1.iterations_cadence_id IS NOT NULL\n        ) as t\n        WHERE t.id=sprints.id AND (sprints.sequence IS NULL OR sprints.sequence <> t.row_number)\n")
   -> 0.0022s
== 20211130165043 BackfillSequenceColumnForSprintsTable: migrated (0.0023s) ===

Query plan for the backfill migration

QUERY PLAN                                                                                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on sprints  (cost=4177.53..4747.18 rows=16142 width=438) (actual time=3155.937..3155.940 rows=0 loops=1)
   Buffers: shared hit=653237 read=1856 dirtied=4159 written=2046
   I/O Timings: read=733.711
   ->  Hash Join  (cost=4177.53..4747.18 rows=16142 width=438) (actual time=56.372..118.406 rows=16223 loops=1)
         Hash Cond: (t.id = sprints.id)
         Join Filter: ((sprints.sequence IS NULL) OR (sprints.sequence <> t.row_number))
         Buffers: shared hit=29547 read=193 dirtied=28
         I/O Timings: read=11.324
         ->  Subquery Scan on t  (cost=2397.56..2884.25 rows=16223 width=56) (actual time=19.436..56.130 rows=16223 loops=1)
               Buffers: shared hit=14172 read=64
               I/O Timings: read=1.236
               ->  WindowAgg  (cost=2397.56..2722.02 rows=16223 width=24) (actual time=19.424..46.442 rows=16223 loops=1)
                     Buffers: shared hit=14172 read=64
                     I/O Timings: read=1.236
                     ->  Sort  (cost=2397.56..2438.12 rows=16223 width=16) (actual time=19.300..23.036 rows=16223 loops=1)
                           Sort Key: s1.iterations_cadence_id, s1.start_date
                           Sort Method: quicksort  Memory: 1145kB
                           Buffers: shared hit=14172 read=64
                           I/O Timings: read=1.236
                           ->  Index Scan using sequence_is_unique_per_iterations_cadence_id on sprints s1  (cost=0.29..1263.11 rows=16223 width=16) (actual time=0.057..12.398 rows=16223 loops=1)
                                 Index Cond: (iterations_cadence_id IS NOT NULL)
                                 Buffers: shared hit=14169 read=64
                                 I/O Timings: read=1.236
         ->  Hash  (cost=1577.18..1577.18 rows=16223 width=394) (actual time=36.809..36.809 rows=16223 loops=1)
               Buckets: 16384  Batches: 1  Memory Usage: 4530kB
               Buffers: shared hit=15372 read=129 dirtied=28
               I/O Timings: read=10.087
               ->  Index Scan using sprints_pkey on sprints  (cost=0.29..1577.18 rows=16223 width=394) (actual time=0.008..25.605 rows=16223 loops=1)
                     Buffers: shared hit=15372 read=129 dirtied=28
                     I/O Timings: read=10.087
 Planning Time: 42.524 ms
 Execution Time: 3156.431 ms
(32 rows)

Down

== 20211130165043 BackfillSequenceColumnForSprintsTable: reverting ============
== 20211130165043 BackfillSequenceColumnForSprintsTable: reverted (0.0000s) ===


== 20211126042235 AddSequenceColumnToSprintsTable: reverting ==================
-- remove_column(:sprints, :sequence, :integer)
   -> 0.0029s
== 20211126042235 AddSequenceColumnToSprintsTable: reverted (0.0065s) =========

Query plans

Updating sequence numbers for iterations within a cadence

sample query:

The sample query updates a cadence containing 128 iterations. (the cadence belongs to a GitLab team member. SELECT COUNT(id), iterations_cadence_id FROM sprints GROUP BY iterations_cadence_id ORDER BY 1 DESC LIMIT 1;)

UPDATE sprints SET sequence=t.row_number
FROM ( 
SELECT id, row_number() OVER (ORDER BY start_date) FROM sprints 
WHERE iterations_cadence_id = 913
) as t
WHERE t.id=sprints.id AND (sprints.sequence IS DISTINCT FROM t.row_number);

query plan:

QUERY PLAN                                                                                         
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on sprints  (cost=172.19..519.06 rows=127 width=438) (actual time=48.964..48.968 rows=0 loops=1)
   Buffers: shared hit=4738 read=248 dirtied=298 written=28
   I/O Timings: read=11.886
   ->  Nested Loop  (cost=172.19..519.06 rows=127 width=438) (actual time=0.384..2.543 rows=128 loops=1)
         Buffers: shared hit=445 read=29 dirtied=4
         I/O Timings: read=0.964
         ->  Subquery Scan on t  (cost=171.90..175.42 rows=128 width=56) (actual time=0.332..0.780 rows=128 loops=1)
               Buffers: shared hit=88 read=2 dirtied=4
               I/O Timings: read=0.061
               ->  WindowAgg  (cost=171.90..174.14 rows=128 width=20) (actual time=0.318..0.641 rows=128 loops=1)
                     Buffers: shared hit=88 read=2 dirtied=4
                     I/O Timings: read=0.061
                     ->  Sort  (cost=171.90..172.22 rows=128 width=12) (actual time=0.310..0.358 rows=128 loops=1)
                           Sort Key: sprints_1.start_date
                           Sort Method: quicksort  Memory: 31kB
                           Buffers: shared hit=88 read=2 dirtied=4
                           I/O Timings: read=0.061
                           ->  Index Scan using index_sprints_iterations_cadence_id on sprints sprints_1  (cost=0.29..167.42 rows=128 width=12) (actual time=0.074..0.266 rows=128 loops=1)
                                 Index Cond: (iterations_cadence_id = 913)
                                 Buffers: shared hit=85 read=2 dirtied=4
                                 I/O Timings: read=0.061
         ->  Index Scan using sprints_pkey on sprints  (cost=0.29..2.67 rows=1 width=394) (actual time=0.012..0.012 rows=1 loops=128)
               Index Cond: (id = t.id)
               Filter: (sequence IS DISTINCT FROM t.row_number)
               Buffers: shared hit=357 read=27
               I/O Timings: read=0.902
 Planning Time: 15.654 ms
 Execution Time: 49.360 ms
(28 rows)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by euko

Merge request reports