Skip to content

testdb: Remove global advisory lock when creating Praefect database

Patrick Steinhardt requested to merge pks-testdb-remove-advisory-lock into master

Setting up the Praefect database for our tests works by populating a shared template database with our migrations. This template database is then used as the source of the actual database we're using for the test and is shared across our tests. It is thus essentially a global shared resource not only across tests of a single testrun, but even across multiple runs of the testsuite. This brings multiple problems with it:

- We need to acquire a global advisory lock so that no two tests try
  to update the template database at the same point in time. This
  sequentializes tests which want to create a test database.

- There is a race between creating the template database and the
  real database as we release the lock before we create the real
  database. Consequentially, any concurrently running test with a
  different set of migrations could modify the state of the template
  database.

- We need to have a bunch of logic to detect when the template
  database is not in a state that would be recognized by us.
  Overall, this solution is kind of complex.

Now the question is why we have decided to do it like this in the first place. The logic has been introduced via c61cdadb (Replace in-memory queue with Postgres implementation, 2021-07-14), which does not mention why we use a shared template database instead of just creating the final target database directly. There are two reasons I can think of though:

1. It stresses the migration logic a bit more. This is in my opinion
   not a good argument though as the migration logic should ideally
   be tested as a standalone unit anyway.

2. We want to avoid having to re-run database migrations on every
   test in order to be more efficient.

I doubt that (1) had been the reason, and assume it was (2). Benchmarks show though that this optimization does not quite work in our favor though:

Benchmark 1: with advisory lock
  Time (mean ± σ):     12.042 s ±  0.487 s    [User: 42.657 s, System: 8.606 s]
  Range (min … max):   11.551 s … 12.643 s    5 runs

Benchmark 2: without advisory lock
  Time (mean ± σ):     10.928 s ±  0.080 s    [User: 43.569 s, System: 9.402 s]
  Range (min … max):   10.816 s … 11.007 s    5 runs

Summary
  'without advisory lock' ran
    1.10 ± 0.05 times faster than 'with advisory lock'

As you can see, the tests run about 10% faster without the advisory lock. This is because the tests can now be parallelized better as they do not depend on a global lock anymore. That being said, both user and system time have increased as we now spend more time initializing the database. But that increase is seemingly getting absorbed by improved parallelization.

Remove the global advisory lock. This simplifies the code, reduces the amount of global state we have and speeds up our tests.

Merge request reports