Skip racey tests for data race jobs
Background
We run our tests to check for data races but we have some known failrues. At the moment the way to prevent ourselves from introducing new data races is looking at the job logs and grep of DATA RACE
and then count the occurrence of it, we do this in a separate stage and have a limit to the number of allowed data races.
Curent problems
- Checking the count is not very clear which tests actually failed, which leads to problems like !1770 (comment 358536071) which is a big time waste looking a logs.
- Checking count can be misleading because a single test can have multiple
DATA RACE
and if we make the test/function more complex there is a change we increase the DATA RACE even though we don't introduce any new data races.
Proposal
1. Skip each test that has a data race.
Use t.Skip to skip a test that is known to have data races inside of CI for example:
func TestSomething(t *testing.T) {
// $LINK_TO_ISSUE
if dataRaceTest() {
t.Skip()
}
}
func dataRaceTest() {
// Check an environment variable that specifies this is running with -race flag.
}
We can check the TESTFLAGS environment variable and if it has the -race
flag skip it.
2. Use junit test reports for race tests
Upload the junit reports for data races as well so that we can see the failures right inside of the merge request.
3. Remove allow failure for job
Remove allow_failure since known race tests will be skipped so if a new test fails we are aware of it.
Benefits
- We can see exactly what test failed
- We don't relay on the count of
DATA RACE
in the test logs, so there is a higher chance we don't introduce any new failed test since the count is not always the same because of flaky data races (sometimes they get a race sometimes they don't) - Each data race is known, documentation and has an issue attached to it.
Edited by Steve Xuereb