Fix a race condition in the shard population logic

Nick Thomas requested to merge (removed):53972-fix-fill-shards into master

What does this MR do?

When multiple GitLab processes start up at the same time, they all try to create the list of shards in the database simultaneously. Sometimes this races: two processes try to add the same shard at the same time.

The MR that introduced this logic was meant to handle the race, but I missed that the failing INSERT SQL command invalidates any surrounding transaction, and such a transaction existed. On retry, this caused the "duplicate key" error we rescue from to be promoted into an "invalid transaction" error, which we don't rescue from.

This MR removes the (overly paranoid) transaction, allowing the existing retry logic to operate correctly.
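
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use ActiveRecord or the real GitLab code: `FakeShardTable`, `DuplicateKeyError`, `AbortedTransactionError`, `find_or_create` and `populate` are all hypothetical stand-ins, and the `racing:` argument simply simulates another process inserting the row just after our lookup misses. The assumption modelled here is the one described above: once a statement fails inside a transaction, every later statement in that transaction fails too.

```ruby
require "set"

# Hypothetical stand-ins for the "duplicate key" and "invalid transaction" errors.
class DuplicateKeyError < StandardError; end
class AbortedTransactionError < StandardError; end

# Toy in-memory table with a unique constraint on the shard name.
class FakeShardTable
  def initialize(racing: [])
    @rows = Set.new
    @racing = racing.dup # names a competing process inserts right after our lookup misses
    @in_tx = false
    @aborted = false
  end

  def transaction
    @in_tx = true
    yield
  ensure
    @in_tx = false
    @aborted = false
  end

  def find(name)
    check_usable!
    found = @rows.include?(name)
    # Simulate the competing process winning the race after our miss.
    @rows.add(name) if !found && @racing.delete(name)
    found
  end

  def insert(name)
    check_usable!
    if @rows.include?(name)
      @aborted = true if @in_tx # a failed statement poisons the open transaction
      raise DuplicateKeyError, name
    end
    @rows.add(name)
  end

  def names
    @rows.to_a.sort
  end

  private

  def check_usable!
    raise AbortedTransactionError, "transaction is aborted" if @aborted
  end
end

# Look the row up first and insert it only if the lookup misses.
def find_or_create(table, name)
  table.insert(name) unless table.find(name)
end

# Retry logic: rescue the duplicate-key error and try the lookup again.
def populate(table, names, retries: 2)
  names.each do |name|
    attempts = 0
    begin
      attempts += 1
      find_or_create(table, name)
    rescue DuplicateKeyError
      retry if attempts <= retries
      raise
    end
  end
end

# Without a surrounding transaction, the retry finds the row and succeeds.
plain = FakeShardTable.new(racing: ["default"])
populate(plain, ["default", "other"])

# With a surrounding transaction, the retry hits the aborted transaction instead.
wrapped = FakeShardTable.new(racing: ["default"])
begin
  wrapped.transaction { populate(wrapped, ["default"]) }
rescue AbortedTransactionError => e
  wrapped_error = e
end
```

In this sketch the retry only works when each attempt is independent of the previous failed statement, which is why dropping the outer transaction is enough for the rescue to do its job.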

Unfortunately, this is quite hard to test, since we're using ActiveRecord's find_or_create_by method, which contains deep magic.

What are the relevant issue numbers?

Does this MR meet the acceptance criteria?

Closes #53972 (closed)

Edited by Nick Thomas