Fix flaky spec throttle temporary email failure

Mario Celi requested to merge incident-10339-fix-flaky-timing-issue-spec into master

What does this MR do and why?

There was a very unlikely timing issue with this spec. The rate limiter increments a counter scoped to a given day. If the first increment happened in the last second of one day and the second increment happened in the first second of the next day, the two increments landed in different counters, so the check was not throttled.
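To make the boundary case concrete, here is a minimal sketch of a day-bucketed counter. `DailyCounter`, its `threshold:` and `now:` parameters, and the dates are all hypothetical and chosen for illustration. This is not the `Gitlab::ApplicationRateLimiter` implementation, only an assumed simplification of the behaviour described above.

```ruby
# Hypothetical stand-in for a day-bucketed rate limiter; not GitLab's code.
class DailyCounter
  def initialize(threshold:)
    @threshold = threshold
    @counts = Hash.new(0)
  end

  # Increments the counter for the UTC day containing `now` and reports
  # whether that day's count now exceeds the threshold.
  def throttled?(key, now:)
    bucket = [key, now.utc.strftime('%Y-%m-%d')]
    @counts[bucket] += 1
    @counts[bucket] > @threshold
  end
end

# Arbitrary example timestamps on either side of midnight UTC.
last_second  = Time.utc(2024, 1, 1, 23, 59, 59)
first_second = Time.utc(2024, 1, 2, 0, 0, 0)

# The two checks straddle midnight, so each lands in its own bucket.
limiter = DailyCounter.new(threshold: 1)
limiter.throttled?(:email, now: last_second)
straddling = limiter.throttled?(:email, now: first_second)

# Both checks fall on the same day, so the second one exceeds the threshold.
same_day = DailyCounter.new(threshold: 1)
same_day.throttled?(:email, now: last_second)
within_day = same_day.throttled?(:email, now: last_second)
```

In this sketch `straddling` is `false` and `within_day` is `true`, which matches the failure mode: the same two calls throttle or not depending only on whether midnight falls between them.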

Steps to find the flaky spec

This one was not so hard to find. The before block calls our rate limiter twice to check whether the request should be throttled, and I saw that the spec would break if those two checks did not fall in the same time segment (day).

The answer was in the Gitlab::ApplicationRateLimiter.throttled? implementation. I then confirmed that the failing pipelines were created close to EOD UTC, which supported my theory. I was also able to use travel_to locally to make the spec fail when the first throttled check ran in the last second of one day and the second ran in the first second of the next day.
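The local reproduction can be approximated without Rails by injecting the clock instead of calling travel_to. The sketch below is an assumption-laden illustration: `ClockedLimiter`, `second_check_throttled?`, and the chosen date are hypothetical, and the real spec and limiter may differ in detail.

```ruby
# Hypothetical limiter that reads an injected clock; not GitLab's code.
class ClockedLimiter
  def initialize(clock:, threshold: 1)
    @clock = clock
    @threshold = threshold
    @counts = Hash.new(0)
  end

  # Buckets each call by the UTC day the clock reports at call time.
  def throttled?(key)
    bucket = [key, @clock.call.utc.strftime('%Y-%m-%d')]
    @counts[bucket] += 1
    @counts[bucket] > @threshold
  end
end

# Mirrors the spec's shape: two checks in a row, where the second is
# expected to report throttling. The clock returns the given times in order.
def second_check_throttled?(first_time, second_time)
  times = [first_time, second_time]
  limiter = ClockedLimiter.new(clock: -> { times.shift })
  limiter.throttled?(:email)
  limiter.throttled?(:email)
end

midnight = Time.utc(2024, 1, 2) # arbitrary date, midnight UTC

# Only the pair that straddles midnight fails to throttle.
fails_across_midnight  = !second_check_throttled?(midnight - 1, midnight)
passes_before_midnight = second_check_throttled?(midnight - 2, midnight - 1)
passes_after_midnight  = second_check_throttled?(midnight, midnight + 1)
```

With checks one second apart, only a single starting second per day triggers the failure, which is consistent with the issue being rare and showing up only in pipelines that ran close to EOD UTC.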

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to gitlab-org/release/tasks#10339 (closed)

Edited by Mario Celi
