Draft: [#399] Parallel network tests

Nikolay Yakimov requested to merge lierdakil/#399-parallel-nettests into master

Description

This MR uses the "lock moneybag access" approach outlined in the issue. It also does a bunch of other things; see the individual commit descriptions.

The caveat is, locking moneybag access essentially prevents full parallelization.
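To illustrate the caveat, here is a minimal sketch of what "lock moneybag access" amounts to. It is not the code from this MR: `Moneybag`, `MoneybagLock`, `withMoneybag`, `transferFrom` and `demo` are hypothetical names, and the moneybag is modelled as a plain in-memory balance rather than a chain account. The point is that every operation run as moneybag goes through a single `MVar`, so those operations are serialized no matter how many test threads there are.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for the shared moneybag: just a balance in a mutable cell.
newtype Moneybag = Moneybag (IORef Integer)

-- The moneybag is only reachable while holding the MVar.
newtype MoneybagLock = MoneybagLock (MVar Moneybag)

newMoneybagLock :: Integer -> IO MoneybagLock
newMoneybagLock funds = do
  ref <- newIORef funds
  MoneybagLock <$> newMVar (Moneybag ref)

-- Run an action as moneybag; other threads block until it finishes.
-- withMVar releases the lock even if the action throws.
withMoneybag :: MoneybagLock -> (Moneybag -> IO a) -> IO a
withMoneybag (MoneybagLock var) = withMVar var

-- Deliberately non-atomic read-modify-write: only safe under the lock.
transferFrom :: Moneybag -> Integer -> IO ()
transferFrom (Moneybag ref) amount = do
  balance <- readIORef ref
  writeIORef ref (balance - amount)

-- Spawn nThreads "tests", each doing opsPerThread one-unit transfers,
-- and return the final moneybag balance.
demo :: Int -> Int -> IO Integer
demo nThreads opsPerThread = do
  lock <- newMoneybagLock 1000000
  done <- newEmptyMVar
  forM_ [1 .. nThreads] $ \_ -> forkIO $ do
    replicateM_ opsPerThread (withMoneybag lock (\mb -> transferFrom mb 1))
    putMVar done ()
  replicateM_ nThreads (takeMVar done)  -- wait for all workers
  withMoneybag lock (\(Moneybag ref) -> readIORef ref)
```

Calling `demo 8 500` from GHCi runs eight workers against one lock. In this toy model the balance stays consistent, but only one worker at a time makes progress on moneybag operations.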

Alternatives considered and rejected (some more tentatively than others):

  • Moneybag-per-test. With hundreds of small moneybags, we'll just burn all funds on fees. Additionally, we don't have a reasonable model to compute projected fees ahead of time, hence each test would have to specify the amount of funds it intends to use. As costs tend to drift, this would be an unreasonable maintenance burden.
  • Moneybag-per-thread. Even if we had a reasonable model to compute projected fees ahead of time, we wouldn't be able to predict costs per thread. This might, however, be workable. One option is to have the job moneybag contain an overabundance of funds and split those funds equally between threads. Worst case, the job moneybag would need Nthreads*totalCost funds.
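The arithmetic behind the moneybag-per-thread option can be sketched as follows. The function names are hypothetical, amounts are in an arbitrary smallest unit, and the fees for the funding transfers themselves are ignored. The worst case follows from not knowing how tests are distributed across threads: any single thread could end up running almost all of the expensive tests, so with an equal split each share has to cover the total cost.

```haskell
-- Funds the job moneybag needs so that an equal split still leaves every
-- thread able to pay for the whole test suite on its own.
worstCaseJobFunds :: Int -> Integer -> Integer
worstCaseJobFunds nThreads totalCost = toInteger nThreads * totalCost

-- Split the job moneybag's funds equally between threads. Any remainder
-- is handed out one unit at a time to the first few threads, so the
-- shares always sum to the original amount.
splitEqually :: Int -> Integer -> [Integer]
splitEqually nThreads jobFunds
  | nThreads <= 0 = []
  | otherwise = replicate extra (share + 1) ++ replicate (nThreads - extra) share
  where
    (share, leftover) = jobFunds `divMod` toInteger nThreads
    extra = fromInteger leftover
```

For example, with 8 threads and a suite that costs 100 units in total, `worstCaseJobFunds 8 100` is 800, and `splitEqually 8 800` gives each thread 100 units.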

Is there a measurable speed-up? Not really:

local-chain test | typical runtime (minutes) | parallel runtime (minutes)
morley-client    | 9                         | 7
cleveland        | 26                        | 23
morley           | 17                        | 17
lorentz          | 17                        | 15

Frankly, this is close to the margin of error.

As for why: the vast majority of the time is spent waiting for local-chain, and most operations are run as moneybag. Parallelization doesn't come for free either: locks have overhead, and since aliases are no longer shared between tests, more moneybag operations need to be run. Thus, the fact there's any measurable speed-up at all is kind of surprising.

One possible improvement is to auto-batch moneybag operations. I haven't thought this through yet, so I can't say how complicated that would be, but if doable, it should have a much larger impact.

Update

I've pushed a couple of experimental commits that implement per-thread moneybags. The code is somewhat ad-hoc, just to see if there are measurable improvements, and apparently there are some:

local-chain test | typical runtime (minutes) | parallel runtime (minutes)
morley-client    | 9                         | 5.5
cleveland        | 26                        | 17.5
morley           | 17                        | 13.5
lorentz          | 17                        | 11.5

Now, this is still far from the mind-blowing improvement one would hope for. That being said, running these tests locally, I see a much larger speedup. Whether this discrepancy is due to fewer cores allocated to each CI job, or due to overcommitment of threads, I couldn't tell.

For completeness, here are some very unscientific measurements of running the tests locally on 8 cores:

local-chain test | typical runtime (minutes) | parallel runtime (min:sec)
morley-client    | 9                         | 1:36
cleveland        | 26                        | 8:06
morley           | 17                        | 2:40
lorentz          | 17                        | 3:45

Purely from the CI perspective, this angle isn't worth pursuing further IMO: the improvements are less than 50%, which doesn't justify the significantly increased complexity. That is, unless we can increase the number of threads in CI jobs severalfold, but even then there are significant tradeoffs. From the perspective of running tests locally, this might be justified to some extent.
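To make the "less than 50%" figure concrete, the relative runtime reduction can be computed from the CI numbers for the per-thread moneybag variant. The snippet below is only a convenience sketch with hypothetical names; the data is copied from the CI table in the update. It works out to roughly 21% to 39% depending on the package.

```haskell
import Text.Printf (printf)

-- Relative reduction in runtime, as a percentage of the typical runtime.
reductionPercent :: Double -> Double -> Double
reductionPercent typical parallel = (typical - parallel) / typical * 100

-- (package, typical minutes, parallel minutes) for per-thread moneybags in CI.
ciPerThread :: [(String, Double, Double)]
ciPerThread =
  [ ("morley-client", 9, 5.5)
  , ("cleveland", 26, 17.5)
  , ("morley", 17, 13.5)
  , ("lorentz", 17, 11.5)
  ]

-- Print one line per package with its runtime reduction.
report :: IO ()
report =
  mapM_
    (\(name, typical, parallel) ->
       printf "%-14s %5.1f%%\n" name (reductionPercent typical parallel))
    ciPerThread
```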

Note that in local testing, I occasionally see tests failing with thread synchronization issues. That suggests either that I messed up the locks somewhere, or that the code that initializes per-thread moneybags needs to wait for the next block explicitly.

Related issue(s)

Resolves #399 (closed) (eventually, hopefully)

Checklist for your Merge Request

Related changes (conditional)

  • Tests (see short guidelines)

    • If I added new functionality, I added tests covering it.
    • If I fixed a bug, I added a regression test to prevent the bug from silently reappearing again.
  • Documentation

    • I checked whether I should update the docs and did so if necessary:
    • I updated changelog files of all affected packages released to Hackage if my changes are externally visible.

Stylistic guide (mandatory)

Edited by Nikolay Yakimov