Draft: Double-checked find-or-create-by to avoid subtrans
What does this MR do?
In order to reduce the likelihood of using subtransactions (and hence their frequency), this changes .safe_find_or_create_by
to use "double-checking" (borrowed from double-checked locking):
- We optimize for the case when the record already exists (which is also what we suggest in !68245 (merged))
- If the record already exists, we only need a single
SELECT
and no subtransaction - If the record does not yet exist, we perform an additional
SELECT
compared to before (and still use a subtransaction).
For the other case (record does likely not exist), we can use .create_or_find_by
and turn this into a PG upsert (which also does not need subtransactions).
The impact of this depends on how often our current implementation of .safe_find_or_create_by
actually hits an existing record.
Current behavior
In both cases (record exists / does not exist), Rails starts a subtransaction.
> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') }
TRANSACTION (0.2ms) BEGIN /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
TRANSACTION (0.1ms) SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
Foo Load (0.4ms) SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
Foo Create (0.5ms) INSERT INTO "foos" ("username") VALUES ('test') RETURNING "id" /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
TRANSACTION (0.1ms) RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
TRANSACTION (11.8ms) COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000560d07f9a370 id: 6, username: "test">
> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') }
TRANSACTION (0.2ms) BEGIN /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
TRANSACTION (0.2ms) SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
Foo Load (0.3ms) SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:67:in `block in safe_find_or_create_by'*/
TRANSACTION (0.2ms) RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
TRANSACTION (0.2ms) COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000560d0eec7518 id: 1, username: "test">
Double-checked behavior
- Record exists already: Only a single select, no subtransaction needed
- Record does not yet exist: One additional select before we start the subtransaction
[2] pry(main)> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') }
TRANSACTION (0.2ms) BEGIN /*application:console,line:/lib/gitlab/database/schema_cache_with_renamed_table.rb:25:in `columns'*/
Foo Load (0.6ms) SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
TRANSACTION (0.1ms) SAVEPOINT active_record_1 /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
Foo Load (0.2ms) SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
Foo Create (0.5ms) INSERT INTO "foos" ("username") VALUES ('test') RETURNING "id" /*application:console,line:/app/models/application_record.rb:73:in `block in safe_find_or_create_by'*/
TRANSACTION (0.1ms) RELEASE SAVEPOINT active_record_1 /*application:console,line:/lib/gitlab/database.rb:214:in `block in transaction'*/
TRANSACTION (11.1ms) COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000562acf9eb150 id: 7, username: "test">
[3] pry(main)> Foo.transaction { Foo.safe_find_or_create_by(username: 'test') }
TRANSACTION (0.2ms) BEGIN /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
Foo Load (0.3ms) SELECT "foos".* FROM "foos" WHERE "foos"."username" = 'test' LIMIT 1 /*application:console,line:/app/models/application_record.rb:66:in `safe_find_or_create_by'*/
TRANSACTION (0.2ms) COMMIT /*application:console,line:/lib/gitlab/database.rb:235:in `commit'*/
=> #<Foo:0x0000562ad99f5ad8 id: 7, username: "test">
How to setup and validate locally (strongly suggested)
Does this MR meet the acceptance criteria?
Conformity
-
I have included changelog trailers, or none are needed. (Does this MR need a changelog?) -
I have added/updated documentation, or it's not needed. (Is documentation required?) -
I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?) -
I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?) -
I have self-reviewed this MR per code review guidelines. -
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines) -
I have followed the style guides. -
This change is backwards compatible across updates, or this does not apply.
Availability and Testing
-
I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.) -
I have tested this MR in all supported browsers, or it's not needed. -
I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Edited by Andreas Brandl