CI: Work around sccache flakiness
What
Adds a wrapper script that works around flakiness in sccache
by retrying its startup up to a bounded number of times.
It also enables sccache debug logs to ease future debugging.
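For readers who have not opened the diff, the retry idea can be sketched roughly as follows. This is a minimal illustration and not the script added by this MR: the function name `retry_start`, the `RETRY_DELAY` variable, and the log path in the comments are hypothetical. `SCCACHE_LOG`, `SCCACHE_ERROR_LOG`, and `sccache --start-server` come from sccache itself.

```shell
#!/bin/sh
# Hypothetical sketch: run a command until it succeeds,
# giving up after a bounded number of tries.
# Usage: retry_start MAX_TRIES COMMAND [ARGS...]
retry_start() {
  max_tries=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    # Run the command; stop retrying as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt/$max_tries failed: $*" >&2
    attempt=$((attempt + 1))
    # RETRY_DELAY is an assumed knob (seconds between tries), default 1.
    sleep "${RETRY_DELAY:-1}"
  done
  return 1
}

# Example invocation (commented out so that sourcing this file has no side effects).
# The log path is a placeholder.
# export SCCACHE_LOG=debug
# export SCCACHE_ERROR_LOG=/tmp/sccache.log
# retry_start 5 sccache --start-server
```

Bounding the number of tries keeps a genuinely broken setup from hanging the job: once the limit is reached, the function returns non-zero and the job fails as before.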
Why
I've been seeing a few sccache-related flaky failures recently:
- https://gitlab.com/tezos/tezos/-/jobs/5850289404
- https://gitlab.com/tezos/tezos/-/jobs/5850289446
- https://gitlab.com/tezos/tezos/-/jobs/5850289449
- https://gitlab.com/tezos/tezos/-/jobs/5850289481
We tracked down the issue here. I don't have the skills to patch sccache directly, so I propose this change until we can get our hands on a patched version.
Manually testing the MR
- Here's a job where you can find the log in the artifacts: https://gitlab.com/tezos/tezos/-/jobs/5851434830
- It's hard to be sure that the flakiness is gone, but I've run a few pipelines that do nothing but start and stop sccache, and with the wrapper I did not observe any failures. We'll also be able to see whether the number of flaky opam jobs decreases in the flaky test dashboard.
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR
Edited by Arvid Jakobsson