let_it_be and before_all incompatible with feature specs - causing cascading test failures
We're experiencing systematic test failures in feature specs due to an incompatibility between test-prof's let_it_be/before_all and Capybara's connection handling. This issue has existed previously (see #539416 (closed) with Chrome 123), but the failures seem to have become more frequent after the Chrome 138 upgrade.
The issue causes the before_all transaction to be rolled back prematurely, leading to:
- Missing data errors (Couldn't find User with 'id'=1)
- Foreign key violations (Key (project_id)=(X) not present in table "projects")
- ActiveRecord::InvalidForeignKey exceptions
- PG::QueryCanceled: ERROR: canceling statement due to statement timeout
- !!! before_all transaction has been already rollbacked and could work incorrectly
Fixes
I've had to apply a number of emergency fixes to stabilise pipelines. Removing let_it_be and before_all seems to resolve the failures:
Root Cause (unconfirmed)
Using example: https://gitlab.com/gitlab-org/gitlab/-/jobs/11767644421#L927 (this test has been fixed since by changing the let_it_be's to let!)
claude --> From the CI log at the end of the test run:
!!! before_all transaction has been already rollbacked and could work incorrectly
This is test-prof's own warning that the before_all transaction was rolled back prematurely.
The Failure Pattern
All 20 failures occurred in spec/features/projects/jobs/permissions_spec.rb:
ActiveRecord::RecordNotFound:
Couldn't find Project with [WHERE "projects"."id" = $1]
# ./spec/features/projects/jobs/permissions_spec.rb:17:in `block (2 levels) in <top (required)>'
Line 17 is in the before block:
before do
  sign_in(user)
  project.enable_ci # ← Fails here because project is gone!
end
The project from let_it_be_with_reload(:project) no longer exists in the database.
Why It Fails in CI But Not Locally
Local execution:
- Run permissions_spec.rb alone → ✅ passes
- before_all transaction is created and maintained properly
CI execution:
- 30+ minutes of other feature specs run first
- Many complete successfully
- Then permissions_spec.rb runs (starts at 18:52:23)
- Immediately fails - the before_all transaction is already corrupt from previous specs
- All 20 examples in the file fail with Couldn't find Project
<-- claude
The Chrome upgrade possibly introduced timing changes that trigger connection resets at different times, making this race condition more common.
Related Failures
Recent test failures after the Chrome upgrade: https://gitlab.com/gitlab-org/gitlab/-/issues/577767+
Example of similar failure before the upgrade: [Test] spec/features/ide_spec.rb | IDE director... (#539416 - closed)
Potential Solution
Convert all instances in feature specs and feature-specific shared examples:
let_it_be(:user) → let!(:user)
let_it_be_with_reload(:project) → let!(:project)
let_it_be_with_refind(:issue) → let!(:issue)
before_all do → before do
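The mapping above could be applied mechanically as a first pass. Below is a minimal sketch in plain Ruby, assuming the helpers are always called with parenthesised arguments and without options such as reload: true (let! does not accept them). The method name convert_feature_spec_source and both regex constants are hypothetical and not part of test-prof or GitLab tooling, and any output would still need manual review.

```ruby
# Hypothetical helper: rewrites test-prof helpers to per-example RSpec equivalents.
# Sketch only; it does not handle options passed to let_it_be.
LET_IT_BE_CALL = /\blet_it_be(?:_with_reload|_with_refind)?(?=\()/
BEFORE_ALL_CALL = /\bbefore_all\b(?=\s*(?:do\b|\{))/

def convert_feature_spec_source(source)
  source
    .gsub(LET_IT_BE_CALL, 'let!')     # let_it_be*(...) -> let!(...)
    .gsub(BEFORE_ALL_CALL, 'before')  # before_all do   -> before do
end

sample = <<~RUBY
  let_it_be(:user) { create(:user) }
  let_it_be_with_reload(:project) { create(:project) }
  before_all do
    project.add_developer(user)
  end
RUBY

puts convert_feature_spec_source(sample)
```

Because the substitution is purely textual, it would also rewrite matching text inside comments or strings, so reviewing the resulting diff is advisable.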
Add validation to spec_helper.rb to prevent new violations, something like:
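# Unverified sketch: assumes let_it_be values are stored in instance variables
# whose names contain 'let_it_be'; check this against the installed test-prof version.
config.before(:context, type: :feature) do
  if instance_variables.any? { |var| var.to_s.include?('let_it_be') }
    raise "⚠️ let_it_be is not allowed in feature specs! Use let! instead."
  end
end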
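A static scan of the spec source is another way to catch new usages, and it does not depend on how test-prof stores values internally. The following is a minimal sketch in plain Ruby. The method name feature_spec_violations and the constant FORBIDDEN_HELPERS are hypothetical, and it assumes that skipping comment-only lines is enough to avoid false positives.

```ruby
# Hypothetical check: finds test-prof helpers in a feature spec's source text.
FORBIDDEN_HELPERS = /\b(let_it_be(?:_with_reload|_with_refind)?|before_all)\b/

# Returns [line_number, helper_name] pairs, one per offending line.
def feature_spec_violations(source)
  source.each_line.with_index(1).filter_map do |line, number|
    next if line.strip.start_with?('#') # skip comment-only lines

    match = line.match(FORBIDDEN_HELPERS)
    [number, match[1]] if match
  end
end
```

It could be run over each file under spec/features (for example via Dir.glob and File.read) in a lint job, failing the job when the returned list is non-empty.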
Performance Impact
let! creates records before each example, whereas let_it_be creates them once before all examples. Spec duration will increase, but this is a worthwhile trade-off for deterministic, reliable tests. The impact could be measured in an MR.
Alternate approach
Just fix the tests that fail, as the majority of tests seem to be unaffected.