Skip to content

Draft: Double-checked find-or-create-by to avoid subtrans

Andreas Brandl requested to merge ab/find-or-create-by into master

What does this MR do?

In order to reduce the likelihood of using subtransactions (and hence their frequency), this changes .safe_find_or_create_by to use "double-checking" (borrowed from double-checked locking):

  1. We optimize for the case when the record already exists (which is also what we suggest in !68245 (merged))
  2. If the record already exists, we only need a single SELECT and no subtransaction
  3. If the record does not yet exist, we perform an additional SELECT compared to before (and still use a subtransaction).

For the other case (record does likely not exist), we can use .create_or_find_by and turn this into a PG upsert (which also does not need subtransactions).

The impact of this depends on how often our current implementation of .safe_find_or_create_by actually hits an existing record.

Current behavior

In both cases (record exists / does not exist), Rails starts a subtransaction.

> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') } 
  TRANSACTION (0.2ms)  BEGIN /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  TRANSACTION (0.1ms)  SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  Foo Load (0.4ms)  SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  Foo Create (0.5ms)  INSERT INTO "foos" ("username") VALUES ('test') RETURNING "id" /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  TRANSACTION (0.1ms)  RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
  TRANSACTION (11.8ms)  COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000560d07f9a370 id: 6, username: "test">
> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') } 
  TRANSACTION (0.2ms)  BEGIN /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  TRANSACTION (0.2ms)  SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  Foo Load (0.3ms)  SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
  TRANSACTION (0.2ms)  RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
  TRANSACTION (0.2ms)  COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000560d0eec7518 id: 1, username: "test">

Double-checked behavior

  • Record exists already: Only a single select, no subtransaction needed
  • Record does not yet exist: One additional select before we start the subtransaction
[2] pry(main)> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') } 
  TRANSACTION (0.2ms)  BEGIN /*application:console,line:/lib/gitlab/database/schema_cache_with_renamed_table.rb:25:in `columns'*/
  Foo Load (0.6ms)  SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
  TRANSACTION (0.1ms)  SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
  Foo Load (0.2ms)  SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
  Foo Create (0.5ms)  INSERT INTO "foos" ("username") VALUES ('test') RETURNING "id" /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
  TRANSACTION (0.1ms)  RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
  TRANSACTION (11.1ms)  COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000562acf9eb150 id: 7, username: "test">
[3] pry(main)> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') } 
  TRANSACTION (0.2ms)  BEGIN /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
  Foo Load (0.3ms)  SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
  TRANSACTION (0.2ms)  COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000562ad99f5ad8 id: 7, username: "test">

How to setup and validate locally (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Andreas Brandl

Merge request reports