Skip to content

Flaky tests: Fix TestAdaptiveCalculator flaky tests (second take)

Quang-Minh Nguyen requested to merge qmnguyen0711/fix-calculator-tests-v2 into master

This is a revert of a revert (original MR). In that MR, we fixed a flaky test but it was reverted accidentally. This MR brings it back.


In TestAdaptiveCalculator, we set the calibration duration to 10 milliseconds. This test setup uses a manual ticker. This calibration duration is irrelevant to the actual calibration cycle of the calculation. That value was supposed to be a placeholder.

Unfortunately, the calculator uses this value to determine a timeout when polling events from the watchers. Although event polling is supposed to be instant, it might not be the case for slow machines. Thus, on CIs, even polling might return deadline exceeded unintentionally. This commit fixes this flake by setting the calibration duration to an unrealistically high value.

This flake can be reproduced in the local environment by adding a small delay to the implementation and hammer the test for enough number of times.

diff --git a/internal/limiter/adaptive_calculator.go b/internal/limiter/adaptive_calculator.go
index 793fc853c..d3f7f7fcf 100644
--- a/internal/limiter/adaptive_calculator.go
+++ b/internal/limiter/adaptive_calculator.go
@@ -4,6 +4,7 @@ import (
        "context"
        "fmt"
        "math"
+       "math/rand"
        "sync"
        "sync/atomic"
        "time"
@@ -196,6 +197,8 @@ func (c *AdaptiveCalculator) pollBackoffEvent(ctx context.Context) {
        ctx, cancel := context.WithTimeout(ctx, c.calibration)
        defer cancel()

+       time.Sleep(time.Duration(rand.Intn(11)) * time.Millisecond)
+
        for _, w := range c.watchers {
                // If the context is cancelled, return early.
                if ctx.Err() != nil {
diff --git a/internal/limiter/adaptive_calculator_test.go b/internal/limiter/adaptive_calculator_test.go
index 5c2a5dae7..d4965bdf4 100644
--- a/internal/limiter/adaptive_calculator_test.go
+++ b/internal/limiter/adaptive_calculator_test.go
@@ -55,8 +55,6 @@ func TestAdaptiveCalculator_realTimerTicker(t *testing.T) {
 func TestAdaptiveCalculator(t *testing.T) {
        t.Parallel()

-       testhelper.SkipQuarantinedTest(t, "https://gitlab.com/gitlab-org/gitaly/-/issues/5467")
-
        tests := []struct {
                desc       string
                limits     []AdaptiveLimiter

Before

go test -race -failfast --count 100 ./internal/limiter/ -run TestAdaptiveCalculator
--- FAIL: TestAdaptiveCalculator (0.28s)
    --- FAIL: TestAdaptiveCalculator/Additive_increase_multiple_limits_until_a_backoff_event (0.05s)
        adaptive_calculator_test.go:575:
            	Error Trace:	/Users/qmnguyen/www/gitlab-development-kit/gitaly/internal/limiter/adaptive_calculator_test.go:575
            	Error:      	Not equal:
            	            	expected: []int{25, 26, 27, 28, 29, 30, 15, 16}
            	            	actual  : []int{25, 26, 27, 28, 29, 30, 31, 15}

            	            	Diff:
            	            	--- Expected
            	            	+++ Actual
            	            	@@ -7,4 +7,4 @@
            	            	  (int) 30,
            	            	- (int) 15,
            	            	- (int) 16
            	            	+ (int) 31,
            	            	+ (int) 15
            	            	 }
            	Test:       	TestAdaptiveCalculator/Additive_increase_multiple_limits_until_a_backoff_event
FAIL
FAIL	gitlab.com/gitlab-org/gitaly/v16/internal/limiter	0.879s

After

go test -race -failfast --count 100 ./internal/limiter/ -run TestAdaptiveCalculator
ok  	gitlab.com/gitlab-org/gitaly/v16/internal/limiter	63.266s

Merge request reports