Reordering issues can cause deadlocks
We observed two deadlocks today on gitlab.com caused by reordering issues on boards:
2018-03-20_14:26:42.33116 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [14-1] LOG: process 104434 detected deadlock while waiting for ShareLock on transaction 2021899451 after 5000.055 ms
2018-03-20_14:26:42.33128 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [15-1] DETAIL: Process holding the lock: 92158. Wait queue: .
2018-03-20_14:26:42.33130 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [16-1] CONTEXT: while updating tuple (460146,8) in relation "issues"
2018-03-20_14:26:42.33132 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [17-1] STATEMENT: UPDATE "issues" SET "relative_position" = 1073741512, "updated_at" = '2018-03-20 14:26:37.328911' WHERE "issues"."id" = 9826807
2018-03-20_14:26:42.33133 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [18-1] ERROR: deadlock detected
2018-03-20_14:26:42.33135 postgres-01 postgresql: 2018-03-20 14:26:42 GMT [104434]: [19-1] DETAIL: Process 104434 waits for ShareLock on transaction 2021899451; blocked by process 92158.
2018-03-20_14:26:42.33140 postgres-01 postgresql: Process 92158 waits for ShareLock on transaction 2021898621; blocked by process 104434.
2018-03-20_14:26:42.33144 postgres-01 postgresql: Process 104434: UPDATE "issues" SET "relative_position" = 1073741512, "updated_at" = '2018-03-20 14:26:37.328911' WHERE "issues"."id" = 9826807
2018-03-20_14:26:42.33146 postgres-01 postgresql: Process 92158: UPDATE "issues" SET "relative_position" = 1073741513, "updated_at" = '2018-03-20 14:26:37.329846' WHERE "issues"."id" = 9648286
And
2018-03-20_14:26:47.53438 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [8-1] LOG: process 87058 detected deadlock while waiting for ShareLock on transaction 2021901187 after 5000.076 ms
2018-03-20_14:26:47.53458 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [9-1] DETAIL: Process holding the lock: 93563. Wait queue: 107179.
2018-03-20_14:26:47.53465 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [10-1] CONTEXT: while updating tuple (238123,20) in relation "issues"
2018-03-20_14:26:47.53471 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [11-1] STATEMENT: UPDATE "issues" SET "relative_position" = 1073741514, "updated_at" = '2018-03-20 14:26:42.524914' WHERE "issues"."id" = 9757365
2018-03-20_14:26:47.53478 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [12-1] ERROR: deadlock detected
2018-03-20_14:26:47.53484 postgres-01 postgresql: 2018-03-20 14:26:47 GMT [87058]: [13-1] DETAIL: Process 87058 waits for ShareLock on transaction 2021901187; blocked by process 93563.
2018-03-20_14:26:47.53496 postgres-01 postgresql: Process 93563 waits for ShareLock on transaction 2021900881; blocked by process 87058.
2018-03-20_14:26:47.53507 postgres-01 postgresql: Process 87058: UPDATE "issues" SET "relative_position" = 1073741514, "updated_at" = '2018-03-20 14:26:42.524914' WHERE "issues"."id" = 9757365
2018-03-20_14:26:47.53515 postgres-01 postgresql: Process 93563: UPDATE "issues" SET "relative_position" = 1073741512, "updated_at" = '2018-03-20 14:26:42.531927' WHERE "issues"."id" = 9826807
In theory it should be possible to avoid deadlocks by performing these updates in a deterministic manner -- ie, by sorting all the issues that need to be updated by id and updating them in that order. In practice of course that can be tricky, for example if the numbers get too close and they need to be rebalanced recursively.
Another option would be to perform a "select for update" on the board first in the same transaction to effectively serialize the reorder operations. That only works if this is the only transaction that updates multiple issues in a single transaction but that seems likely to be the case. For what it's worth it seems quite plausible that the algorithm to do these updates has only been analyzed and tested in serial execution cases so serializing the reorderings (or using serializable transaction isolation) seems like it would be safer anyways.