Draft: Bisect E2E test failure

James Liu requested to merge jliu/bisect-test-failure into master

What does this MR do and why?

In !166025 (closed), we started to see consistent failures in one of the E2E QA tests related to repository housekeeping. Example: https://gitlab.com/gitlab-org/gitlab/-/jobs/7817471056

Failures:
  1) Create Repository Usage Quota matches cloned repo usage to reported usage
     Failure/Error:
               Support::Retrier.retry_until(max_duration: 60, sleep_interval: 5) do
                 # This should perform the same deduplication as in the local repo
                 project.perform_housekeeping
       
                 project.statistics[:repository_size].to_i != initial_size
               end
     
     QA::Support::Repeater::WaitExceededError:
       Wait failed after 60 seconds
     # ./qa/support/repeater.rb:74:in `repeat_until'
     # ./qa/support/retrier.rb:44:in `retry_until'
     # ./qa/specs/features/api/3_create/repository/storage_size_spec.rb:64:in `block (3 levels) in <module:QA>'
     # /builds/gitlab-org/gitlab/spec/support/fast_quarantine.rb:22:in `block (2 levels) in <top (required)>'
     # ./qa/specs/runner.rb:70:in `perform'
     # ./qa/scenario/template.rb:10:in `block in perform'
     # ./qa/scenario/template.rb:8:in `perform'
     # ./qa/scenario/template.rb:35:in `perform'
     # ./qa/scenario/template.rb:10:in `block in perform'
     # ./qa/scenario/template.rb:8:in `perform'
     # ./qa/scenario/bootable.rb:52:in `launch!'
Finished in 3 minutes 26.7 seconds (files took 0.86821 seconds to load)
3 examples, 1 failure
Failed examples:
rspec ./qa/specs/features/api/3_create/repository/storage_size_spec.rb:31 # Create Repository Usage Quota matches cloned repo usage to reported usage
Randomized with seed 15823
** Retry run did not finish successfully, job will be failed! **
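
To make the failure mode concrete, here is a minimal sketch of the polling pattern the failing block relies on. It is not the QA framework's `retry_until` implementation; `poll_until` and `WaitExceeded` are hypothetical names used only for illustration. The point is that the spec fails only if the condition (reported repository size differs from the initial size) never becomes true before the deadline.

```ruby
# Hypothetical error class, standing in for the framework's WaitExceededError.
WaitExceeded = Class.new(StandardError)

# Hypothetical helper: call the block repeatedly until it returns a truthy
# value, sleeping between attempts, and give up once max_duration has elapsed.
def poll_until(max_duration:, sleep_interval:)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_duration
  loop do
    return true if yield

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitExceeded, "Wait failed after #{max_duration} seconds"
    end

    sleep sleep_interval
  end
end

# Usage with a stand-in condition that becomes true on the third attempt.
attempts = 0
poll_until(max_duration: 1, sleep_interval: 0.01) { (attempts += 1) >= 3 }
```

Under this reading, a `Wait failed after 60 seconds` error means every attempt within the window saw the repository size unchanged, which is consistent with the failure being deterministic rather than a timing flake.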

The failures do not appear to be flakes, but it's unclear which Gitaly change caused them, since we haven't made any functional changes to housekeeping recently.

I also wasn't able to reproduce the failure locally by running the specific QA test:

gitlab-qa Test::Instance::Any CE http://gdk.test:3000 -- qa/specs/features/api/3_create/repository/storage_size_spec.rb

This MR allows us to bisect the failure by manually changing GITALY_SERVER_VERSION and observing the pipeline results.
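
As a rough illustration of the bisection, the sketch below picks the first failing version from an ordered candidate list. Everything in it is an assumption for illustration: `first_bad_version` is a hypothetical helper, the candidate names are made up, and the block stands in for the manual step of setting GITALY_SERVER_VERSION and reading the pipeline result. It also assumes that once a version fails, every later version fails too.

```ruby
# Hypothetical helper: return the first version for which the pipeline fails.
# Array#bsearch in find-minimum mode returns the first element for which the
# block is true, provided the block is false for all earlier elements and
# true for all later ones.
def first_bad_version(versions, &pipeline_passes)
  versions.bsearch { |version| !pipeline_passes.call(version) }
end

# Made-up candidates, ordered oldest to newest. Real candidates would be
# Gitaly commits or tags between the last good and first bad version.
candidates = %w[v1 v2 v3 v4 v5 v6 v7 v8]

# Stand-in for one manual bisect step. Here we pretend v6 introduced the failure.
checked = []
culprit = first_bad_version(candidates) do |version|
  checked << version
  candidates.index(version) < 5
end

puts "first bad version: #{culprit}, pipelines run: #{checked.size}"
```

Each step halves the remaining range, so the number of pipelines to run grows with the logarithm of the number of candidate versions rather than linearly.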

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by James Liu
