ubuntu-20.04-s390x-all CI job is very flaky

The ubuntu-20.04-s390x-all CI job is very flaky. Something weird seems to be up with the runner (this runs on our s390x VM).

This is a passing job from a recent successful merge: https://gitlab.com/qemu-project/qemu/-/jobs/6089089816

That took 38 minutes to run.

This is a failing job for the same commit: https://gitlab.com/qemu-project/qemu/-/jobs/6086439717

The failing job took 58 minutes to run. Looking at the embedded timestamps in the log, it took over 26 minutes to do the configure-and-compile phase of the run, whereas the passing job did that in about 21 minutes.

In other jobs, both failing and passing, I've seen the configure-and-compile phase take anywhere between 13 and 39 minutes. So the entire run of the job can sometimes slow down by 2x or more, which is enough to push either the whole job or individual tests past their timeouts.
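
As a rough sanity check on the numbers quoted above, the following Python sketch works out the slowdown ratios. The durations are the approximate figures from this report ("about 21", "over 26"), so the results are only indicative. The variable names and the `slowdown` helper are made up for illustration.

```python
# Durations in minutes, copied from the figures quoted in this report.
# They are approximate, so treat the ratios as ballpark values only.
passing_total = 38   # whole passing job
failing_total = 58   # whole failing job
passing_build = 21   # configure-and-compile, passing job ("about 21")
failing_build = 26   # configure-and-compile, failing job ("over 26")
build_fastest = 13   # fastest configure-and-compile seen in other jobs
build_slowest = 39   # slowest configure-and-compile seen in other jobs


def slowdown(slow, fast):
    """Return how many times longer `slow` is than `fast`."""
    return slow / fast


total_ratio = slowdown(failing_total, passing_total)
build_ratio = slowdown(failing_build, passing_build)
build_spread = slowdown(build_slowest, build_fastest)

# Time spent outside configure-and-compile (mostly running tests).
passing_rest = passing_total - passing_build
failing_rest = failing_total - failing_build
rest_ratio = slowdown(failing_rest, passing_rest)

print(f"whole job, failing vs passing:      {total_ratio:.2f}x")
print(f"configure-and-compile, same commit: {build_ratio:.2f}x")
print(f"configure-and-compile, best/worst:  {build_spread:.2f}x")
print(f"rest of job, failing vs passing:    {rest_ratio:.2f}x")
```

For the two jobs on the same commit, this suggests the time outside configure-and-compile grew proportionally more than the build phase did. Since the inputs are rounded, that is a hint rather than a firm conclusion.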

I look in on the machine from time to time and it doesn't seem to be doing anything it shouldn't that would be eating CPU. So I'm not sure why the performance of the CI job would vary so wildly.
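
One thing that looking at processes on the machine would not show is CPU time taken by the hypervisor. The sketch below samples the aggregate `cpu` line of `/proc/stat` twice and reports what fraction of the interval was steal time. This is only a possible diagnostic: that steal time is a factor here is an assumption, not something established in this report. The function names are hypothetical, and the code assumes a Linux kernel that exposes at least the first eight counters on the `cpu` line.

```python
import time

# First eight counters on the aggregate "cpu" line of /proc/stat.
# The later guest/guest_nice fields are already included in user/nice,
# so they are left out to avoid counting that time twice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("kernel does not report all expected counters")
    return dict(zip(FIELDS, values))


def steal_fraction(before, after):
    """Fraction of CPU time between two samples that was steal time."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    return deltas["steal"] / total if total else 0.0


def read_cpu_counters(path="/proc/stat"):
    """Read the current aggregate CPU counters (the first line of the file)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def sample_steal(interval=5.0):
    """Sample /proc/stat twice, `interval` seconds apart, and return steal fraction."""
    before = read_cpu_counters()
    time.sleep(interval)
    after = read_cpu_counters()
    return steal_fraction(before, after)
```

Running `sample_steal()` on the VM while a CI job is in progress, once during a fast run and once during a slow one, would show whether the two differ in steal time.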

Edited by Peter Maydell