
[#910] Add caching of address/alias mappings to morley-client

Nikolay Yakimov requested to merge lierdakil/#910-cache-address-alias-map into master

Alternative design: instead of Bimap could use two dependent maps. For the little benefit we get (that is, matching alias/address kind is enforced by GHC), it seems a bit of a pointless kludge. But feel free to convince me otherwise.
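For the record, a minimal sketch of the bidirectional cache shape under discussion. This is illustrative only: `Alias` and `Address` stand in for morley-client's actual types, and two `Data.Map`s emulate a `Bimap` so the snippet is self-contained (the real code uses the `bimap` package).

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for morley-client's SomeAlias and Address types.
type Alias = String
type Address = String

-- Two maps kept in sync, emulating Bimap SomeAlias Address.
data AliasCache = AliasCache
  { byAlias   :: Map.Map Alias Address
  , byAddress :: Map.Map Address Alias
  } deriving Show

emptyCache :: AliasCache
emptyCache = AliasCache Map.empty Map.empty

-- Insertion updates both directions so they never disagree.
insertAlias :: Alias -> Address -> AliasCache -> AliasCache
insertAlias a addr (AliasCache f b) =
  AliasCache (Map.insert a addr f) (Map.insert addr a b)

lookupByAlias :: Alias -> AliasCache -> Maybe Address
lookupByAlias a = Map.lookup a . byAlias

lookupByAddress :: Address -> AliasCache -> Maybe Alias
lookupByAddress addr = Map.lookup addr . byAddress
```

The dependent-map alternative would additionally make the alias/address kind agree at the type level, at the cost of two structures and the associated plumbing.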

Description

Reduce the number of pointless calls to octez-client by caching the result of `list known contracts`. The first iteration simply invalidates the cache whenever the alias store is touched; the second tries to be a little smarter about when to invalidate and when to update in place.
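The two strategies can be sketched as follows. This is a simplified model with invented names, not the MR's actual code: the "simple" variant drops the whole cache on any alias-store write, while the "smarter" one applies a known insertion in place.

```haskell
-- A cache is either empty (next read refetches from octez-client)
-- or holds the parsed output of `list known contracts`.
data Cache k v = Empty | Cached [(k, v)] deriving (Eq, Show)

-- Simple strategy: any write to the alias store invalidates everything.
simpleOnWrite :: Cache k v -> Cache k v
simpleOnWrite _ = Empty

-- Smarter strategy: when we know exactly what changed, update the
-- cached entries in place instead of discarding them.
smarterOnInsert :: Eq k => k -> v -> Cache k v -> Cache k v
smarterOnInsert _ _ Empty = Empty  -- nothing cached, nothing to update
smarterOnInsert k v (Cached kvs) =
  Cached ((k, v) : filter ((/= k) . fst) kvs)
```

The trade-off is that the in-place update needs per-operation knowledge of how each alias-store write affects the mapping, which is where the extra complexity comes from.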

Some rather unscientific measurements (I didn't want to waste several hours on re-running the test-suite, but I did try to run in the same conditions more or less, so it should at least show trends), running morley-test with --cleveland-mode=only-network -j1 (-j1 is mostly to improve repeatability):

master:

real    7m54,352s
user    6m46,676s
sys     0m35,126s

a0e2b42d (simple cache as [(Text, Text)]):

real    7m16,076s
user    6m7,713s
sys     0m32,266s

0f5d06a7 (simple cache as Bimap SomeAlias Address):

real    7m25,322s
user    6m9,123s
sys     0m30,937s

cefc8158 (smarter cache as Bimap SomeAlias Address):

real    6m38,035s
user    5m41,177s
sys     0m30,349s

Simple Bimap seems a little slower than list. That's more or less within the margin of error, but it makes sense -- parsing addresses and constructing the Bimap takes time, and a "simple" cache invalidates the cache entirely whenever the alias store is touched, so it would add up. Since we don't do dramatically more reads than writes in our scenarios, any efficiency gained from logarithmic lookups is overshadowed by the inefficiency of constructing the map itself.
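To make the cost argument above concrete, here is an illustrative comparison (hypothetical helper names). A list cache stores the octez-client output essentially as-is, so "rebuilding" it after invalidation is free; a map cache pays O(n log n) construction on every rebuild, and its O(log n) lookups only win if many reads happen between invalidations.

```haskell
import qualified Data.Map.Strict as Map

-- List cache: the parsed alias/address pairs are kept as-is,
-- so a rebuild after invalidation costs nothing extra.
rebuildList :: [(String, String)] -> [(String, String)]
rebuildList = id

-- Map cache: every rebuild pays O(n log n) to construct the map,
-- including address parsing in the real code.
rebuildMap :: [(String, String)] -> Map.Map String String
rebuildMap = Map.fromList

-- Lookups then cost O(n) for the list vs O(log n) for the map;
-- with frequent invalidation the construction cost dominates.
lookupList :: String -> [(String, String)] -> Maybe String
lookupList = lookup
```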

This might make it look like the "simple" cache does little and the "smarter" one is leagues better. The story changes somewhat on CI, however: comparing f9eceed2 against cefc8158 shows a rather marginal difference. The difference is -j1, AFAICT. In theory this shouldn't matter much, since network tests are single-threaded anyway, but apparently it does. We could add -j1 to CI for network tests; that would save a few CPU cycles, too. Still, the question of whether the more complicated logic is worth it remains open.

Additionally, some comparisons against another recent, fairly selected CI run (specifically this pipeline, which seems reasonably representative):

| test | old | this MR |
| --- | --- | --- |
| morley | 13:18 | 11:43 |
| morley-client¹ | 5:37 | 3:32 |
| lorentz | 9:02² | 6:36 |
| cleveland | 16:16 | 11:50 |

¹ Note that morley-client is faster mostly due to f9eceed2, but there is some speed-up due to caching, too.
² This 9:02 is a bit on the high end; I found runs ending in 8:02.

Related issue(s)

Resolves #910 (closed)

Checklist for your Merge Request

Related changes (conditional)

  • Tests (see short guidelines)

    • If I added new functionality, I added tests covering it.
    • If I fixed a bug, I added a regression test to prevent the bug from silently reappearing.
  • Documentation

    • I checked whether I should update the docs and did so if necessary:
    • I updated changelog files of all affected packages released to Hackage if my changes are externally visible.

Stylistic guide (mandatory)
