Reduce our test suite duration

A note on parallelization

Parallelization is a good short-term solution to quickly reduce the feedback cycle for developers before their contribution can be reviewed and merged. However, its cost can become a problem and ideally any developer should be able to run the full test suite in a resonable time on their own computer. Right now, the full test suite takes 7 hours to finish without parallelization.

1. Monitor and profile our RSpec suite

Our Redash dashboard will be our source of truth:

http://redash.gitlab.com/dashboard/test-suite-statistics

Add rspec_profiling to the test environment, run it on master and store the data into a new PG database (https://gitlab.com/gitlab-com/infrastructure/issues/976): !7854 (merged)
- Track EE as well: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10226
Exploit this DB using Redash (gitlab-com/infrastructure#877)
Create Redash graphs:
- "The 50 most ..." (https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/7854#note_19287450)
- Consolidate suite duration per day / week in order to see the progress made (or not): #28777 (closed)

We need to consolidate average duration per day to see the trend go down (or at least stabilize / increase more slowly than today)

2. Create project without repos when possible

Investigate bypassing the login procedure for most feature specs - #30196 (closed)

Use sign_in instead of login_as when we're not testing login flow: !10296 (merged)
Except where necessary, use sign_in over login_as in features: !10320 (merged)

3. Some techniques that can reduce tests duration

Consider using database deletion over truncation in tests for performance: #30783 (closed) / !10624 (closed)
#23036 (closed) - Migrate all Spinach tests to RSpec (since we currently only measure RSpec tests duration, our reporting is actually not complete)
Reduce the main offending test files to reduce their overall duration: http://redash.gitlab.com/queries/15

Once we don't have any test file that is very long to execute, we will need to look for particular examples that don't take long compared to the suite duration but that take long compared to what they should be testing.

Use `before(:context)` / `after(:context)` (with caution!)

In some cases, using before(:context) / after(:context) should be used, for instance in this particular case, the test duration was reduced from ~8 minutes to 11 seconds! That's because in that case, the setup was where most of time was spent (~10s) so it made sense to do it only once.

Use rspec-set to speed up examples: !10551 (merged)

Group expectations

Even though in theory we should have one expectation per test, in practice it's often OK to group several expectations (possibly using aggregate_failures), particularly when the test setup is heavy.

In the same idea you can use the have_attributes matcher, so instead of:

# Here the setup is run twice
it { expect(model.name).to eq('foo') }
it { expect(model.size).to eq(2) }

you can do:

# Here the setup is run only once
it { expect(model).to have_attributes(name: 'foo', size: 2) }

Chain expectations

RSpec allows to use compund matchers, so instead of:

# Here the setup is run twice
it { is_expected.to start_with('Foo') }
it { is_expected.to end_with('Bar') }

you can do:

# Here the setup is run only once
it { is_expected.to start_with('Noel').and end_with('Rappin') }

Move some callbacks to the Service layer

Callbacks that are only executed on :create or on :update are good candidates to go into the Service layer for a few reasons:

Models shouldn't care about lifecycle-specific behaviors
These extra behaviors won't be triggered when using spec factories (at least until we use the services in factories...) meaning less side-effects and a less slower test suite in the end.

Examples:

before_validation :generate_password, on: :create
before_validation :signup_domain_valid?, on: :create, if: ->(user) { !user.created_by_id }
after_create :post_create_hook

from app/models/user.rb.

Move validations to the Service layer – #27100 (moved)

In the same idea, we could move validations to the service layer to remove side-effects from Model.create.

Examples:

validate :namespace_uniq, if: ->(user) { user.username_changed? }
validate :avatar_type, if: ->(user) { user.avatar.present? && user.avatar_changed? }
validate :unique_email, if: ->(user) { user.email_changed? }
validate :owns_notification_email, if: ->(user) { user.notification_email_changed? }
validate :owns_public_email, if: ->(user) { user.public_email_changed? }

from app/models/user.rb.

Consider removing tests that are duplicated

We already have a very good coverage and that probably means that we're over-testing some parts of the application. We might want to remove some tests.

4. Don't depend on gitlab-test / external / actual repositories in tests

Consider using Rugged to build repo programmatically instead of relying on gitlab-test: gitlab-org/gitlab-ce#23517
- We need to abstract the programmatic creation of repos (starting with some helpers)
- We should document how to use these helpers
- We should tell people to use this new strategy by default

Prior work:

Test suite statistics: https://gitlab.com/gitlab-org/gitlab-ce/issues/23034
!6197 (merged)
!6722 (merged)
!6723 (merged)
!6730 (merged)
!7282 (closed)
!7283 (merged)
!7512 (merged)
!7779 (merged)
!7784 (merged)
!7854 (merged)

Edited Dec 18, 2017 by Nick Thomas