
Add new client mode: proxy

Clément Hurlin requested to merge smelc/tezos:smelc-issue-154-proxy into master

Fixes nomadic-labs/tezos#154

Branch: https://gitlab.com/smelc/tezos/-/commits/smelc-issue-154-proxy

opam CI branch: https://gitlab.com/smelc/tezos/-/commits/smelc-issue-154-proxy-opam

This MR adds a new mode to the client, called the proxy mode. In this mode (selected with --mode proxy), the client performs some RPCs locally, retrieving the required data from the node using the /chains/main/blocks/head/context/raw/bytes RPC. The RPCs done locally are the ones provided by Protocol_client_context.Lifted_protocol.rpc_services.

Implementation

This section has been contributed to the repo itself, see https://gitlab.com/smelc/tezos/-/blob/smelc-issue-154-proxy/src/lib_proxy/README.md

  • The client's CLI interface has been extended with the new proxy flag: 3275e1df1af3283ed6b0f3f51403f0ac97894c81.
  • The mockup's implementation of RPC_context.json has been moved to a new lib_mockup_proxy directory: lib_mockup_proxy/RPC_client.ml#L114. This implementation is parameterized by the RPC_directory.t tree to serve, and it is local: it executes RPCs in-process instead of sending them to the node.
  • The mockup's implementation of RPC_context.json has been modified to expose the RPC_not_found error. The proxy uses it to fall back to sending an RPC to the node when that RPC cannot be performed locally (a sketch of this fallback follows the list).
  • The proxy then builds its own RPC client on top of the mockup's one: lib_proxy/RPC_client.ml. This provides an instance of RPC_context.json that performs RPCs locally when possible and otherwise delegates them to the node the classic way. This implementation is used by a new Client_context.full dedicated to the proxy: unix_proxy.
  • Like the mockup, each implementation of the proxy registers itself by implementing registration.mli. Proxy implementations are protocol-dependent because the /chains/main/blocks/head/context/raw/bytes RPC is.
  • A new implementation of Environment_context.Context has been added: proxy_context. This implementation is similar to memory_context but it can perform the .../raw/bytes RPC in its raw_get method.
  • To do so, proxy_context's get function calls proxy_getter's do_rpc function. proxy_getter is a thin wrapper over the protocol-dependent call to ../raw/bytes (more on the wrapper below). The wrapper calls one of the three protocol-dependent implementations: alpha, carthage, or genesis. Protocol-dependent implementations are instances of the PROTO_RPC module type.
  • When doing the ../raw/bytes RPC, the RPC wrapper stores the result in its own cache, to avoid repeating an RPC call that has been done already. The cache is a tree like the one in memory_context. When an RPC completes, the subtree obtained is merged into the tree cached so far (see function set_leaf in proxy_getter.ml).
  • The RPC wrapper proxy_getter also takes care of applying the take a parent tree heuristic (see function split_key in proxy_getter.mli and the sketch after this list). This heuristic, proposed by klakplok, can make the proxy retrieve a parent tree of the context (i.e. the tree at a prefix of the key actually requested) in order to reduce the number of RPC calls. The point is that, if a long key is being retrieved, a sibling key is likely to be requested soon; requesting the parent of both keys fetches them in a single RPC.
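
To make the fallback mechanism concrete, here is a minimal sketch, with hypothetical names (the real code lives in lib_proxy/RPC_client.ml on top of lib_mockup_proxy), of trying the local directory first and delegating to the node when the service is not found locally:

    (* Sketch with hypothetical names: [local_call] executes the RPC against the
       local RPC_directory.t, [node_call] sends it to the node. *)
    let call_local_or_delegate ~local_call ~node_call service params =
      let open Lwt.Infix in
      local_call service params >>= function
      | Ok result -> Lwt.return_ok result
      | Error `Rpc_not_found ->
          (* The service is not served locally: delegate to the node. *)
          node_call service params
      | Error _ as err -> Lwt.return err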
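
The protocol-dependent part seen by the wrapper is deliberately small. As a rough sketch (the names and signatures below are assumptions; see proxy_getter.mli and the protocol-specific implementations for the real interface):

    (* Sketch of the assumed shape of the protocol-dependent interface.
       A key is a path in the context, e.g. ["rolls"; "owner"; "snapshot"]. *)
    type key = string list

    module type PROTO_RPC = sig
      (* [split_key key] implements the "take a parent tree" heuristic: when it
         returns [Some (prefix, suffix)], the wrapper requests [prefix] (a parent
         tree of [key]) via ../raw/bytes instead of requesting [key] itself. *)
      val split_key : key -> (key * key) option
    end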

Performance

Two things have been benchmarked for the proxy: tezos-client and tezos-node. tezos-client is expected to perform worse in proxy mode, because it performs more computations locally, while tezos-node is expected to perform fewer computations. Benchmarks have been done using the scripts in tezos-bench.

Performance figures for tezos-client have been obtained by instrumenting the python tests and aggregating the durations of the various CLI commands. Here is a comparison of the mean command durations in proxy mode and in vanilla mode (measured on Nomadic's benchmark machine):

tezos-client profiling

So the proxy client performs roughly on par with the vanilla mode in the context of the python tests.

The performance of tezos-node has been measured by executing this scenario with three participants:

  • A client that executes rpc get /chains/main/blocks/head/helpers/baking_rights?&all=true every second.
  • A client that transfers tez every second
  • A baker

Performance has been tracked by postprocessing the node's logs with TEZOS_LOG=rpc->debug. The results are as follows (nomadic-labs/tezos@8b6f60e9):

tezos-node profiling

There is a very high number of calls to /raw/bytes. To lower it, I added the following get parent heuristic in proxy.ml:

    (* If a subcontext of /rolls/owner/snapshot/i/j is requested, take /rolls/owner/snapshot/i/j instead *)
    | "rolls" :: "owner" :: "snapshot" :: i :: j :: tail ->
        Some (["rolls"; "owner"; "snapshot"; i; j], tail)

With this heuristic, the results are as follows (nomadic-labs/tezos@97dea858):

tezos-node profiling

So the node sadly spends more time when the clients are in proxy mode, because of the ../raw/bytes requests. This could be avoided by making the nodes' deployments more complex, for example by putting caches dedicated to answering ../raw/bytes queries in front of nodes (I tried this in python in branch smelc-issue-154-proxy-webcache but didn't finish in one day and didn't want to spend more time on this).

Please note the following:

  • In the first screenshot there are close to 100K requests to /raw/bytes; the node spends 200 seconds honoring them.
  • In the second screenshot, the number of /raw/bytes requests is down to 25K requests, yet the node spends 450 seconds honoring them.

So it seems it is not the number of requests that matters most; it is rather the size of the requests (i.e. the size of the trees returned).

TODO: should we keep the optimization for /rolls/owner/snapshot/i/j? In my localhost scenario, it seems we shouldn't (since it makes the node spend more time). However, in the real world, lowering the number of RPC calls matters more because the network is much slower, hence this choice.

Where do all these RPCs come from?

The number of /raw/bytes RPCs is high. Its ratio with the closest RPC (/chains/<chain_id>/mempool/monitor_operations) is approximately 38 for 1 (23000 ./ 60). This is surprising, so I dug further, looking at data gathered during a 5-minute execution of tezos-bench's heavyduty.py. During this run, there are 5192 misses in the proxy cache (see data: smelc/tezos$2013245), i.e. the proxy does 5192 /raw/bytes RPC requests to the node. Out of these 5192 requests, 4522 yield a Cache node miss answer, meaning the requested key is NOT present in the node. All these keys are of the form rolls;owner;snapshot;i;j;k;l;m.

Unfortunately, the get parent heuristic kicks in for these requests, so the client ends up requesting rolls;owner;snapshot;i;j, which is a large tree (typically 2500 nodes). This looks odd because such trees are typically in the cache already; however, it could make sense to retrieve them again if they changed or have been extended.

Which brings me to my final remarks:

  • Can the value associated with a key change on the node? If so, we should never cache such values in the proxy. If a value can be extended (values are trees), we may need to handle cache misses differently. The current implementation assumes that the value mapped to a key on the node never changes.
  • Is it safe to do queries using the block identifier head? The corresponding block may change on the node while the client is running. We could instead resolve head to a concrete identifier the first time we see it (if applicable), and then use that identifier (a sketch of this idea follows this list). With the current implementation, head receives no special treatment.
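
As an illustration of the second remark, here is a minimal sketch (hypothetical helper, not code from this MR) of resolving head once and reusing the obtained identifier afterwards:

    (* Hypothetical helper: resolve `Head to a concrete block hash the first time
       it is seen, then always reuse that hash, so the client keeps an immutable
       view even if the node's head moves in the meantime. *)
    let pinned_head = ref None

    let resolve_block ~fetch_head_hash = function
      | `Hash h -> Lwt.return h
      | `Head -> (
          match !pinned_head with
          | Some hash -> Lwt.return hash
          | None ->
              let open Lwt.Infix in
              fetch_head_hash () >|= fun hash ->
              pinned_head := Some hash ;
              hash)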

Reducing the number of RPCs

The good news is: if we want the client to have an immutable view of the node (and I think we do), we can cache both received data AND misses. To cache misses, we have to be careful, because the tree of data alone is not enough: if there is data for key a/b/c and key a/b/d, it does NOT mean key a/b has been requested already; maybe only a/b/c and a/b/d have been.

That is why I've introduced the RequestsTree module to keep track of requests done already. It is a tree whose nodes are either of type Partial or type All. All nodes are always leaf nodes. A key mapped to All means that a request for this exact key has been done already: there is no point redoing it, nor asking for a longer key (this holds because we want an immutable view of the node). Hence, in the scenario above, a/b/c and a/b/d are mapped to All while a and a/b are mapped to Partial. If the request a/b is done later on, the tree shrinks so that a/b is mapped to All.
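
As a rough sketch (simplified; the real RequestsTree exposes empty, add, and find_opt, whereas this sketch uses a covered helper), the structure could look like this:

    (* Simplified sketch of the Partial/All tree described above.
       Keys are paths in the context, e.g. ["a"; "b"; "c"]. *)
    type tree =
      | All                             (* this exact key was requested in full *)
      | Partial of (string * tree) list (* only some children were requested *)

    let empty = Partial []

    (* [add t key] records that [key] has been requested in full. When [key]
       itself is reached, its whole subtree collapses to [All]. *)
    let rec add t key =
      match (t, key) with
      | (All, _) -> All (* an ancestor is already All: nothing more to record *)
      | (_, []) -> All (* this exact key was requested: shrink to All *)
      | (Partial children, k :: rest) ->
          let child =
            match List.assoc_opt k children with
            | Some c -> c
            | None -> Partial []
          in
          Partial ((k, add child rest) :: List.remove_assoc k children)

    (* [covered t key] is true iff [key] or one of its prefixes was already
       requested in full, in which case no further RPC is needed. *)
    let rec covered t key =
      match (t, key) with
      | (All, _) -> true
      | (Partial _, []) -> false
      | (Partial children, k :: rest) -> (
          match List.assoc_opt k children with
          | None -> false
          | Some child -> covered child rest)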

Using the requests tree, the number of RPCs in my 5-minute scenario is divided by roughly 4 (1363 * 4 = 5452 ~= 23000 / 4), and the node's performance improves (data obtained on this commit):

This benchmark was obtained on Nomadic's bench machine (whereas the previous tezos-node benchmarks were obtained on my machine) and it covers a single 5-minute run, while there are four such runs in the previous screenshot. I've shown a single run here because most bars get quite small otherwise, making the chart harder to read; the means of 4 runs are here.

The node performs fewer computations for some RPCs (notably .../monitor_operations, /injection/block, baking_rights and .../context/contracts). It serves twice as many calls to /chains/<chain_id>/<block_id>/header, which is expected because it is an RPC used by the proxy to initialize its state. .../raw/bytes is still the busiest RPC, but in production this would be mitigated by caching it with an HTTP proxy.

Tests

Tests are implemented in tests_python/tests/test_proxy.py and in src/lib_proxy/test/test_proxy.ml.

The python tests do the following:

  • There are tests to check that an RPC is done locally or delegated to the node: test_chain_block_context_delegates, test_chain_blocks, test_network_self.
  • Tests of test_rpc.py are executed in proxy mode: TestAlphaProxyRPCs
  • test_compare checks that the vanilla and proxy clients return the same data for a number of RPCs, where the vanilla client delegates the RPC while the proxy performs it locally. This test is the most important one.
  • test_context_suffix_no_rpc checks that the proxy caches the results of RPCs correctly: it never does an RPC for a key klong that extends a key kshort whose data was retrieved already.
  • test_cache_at_most_once checks that the proxy's cache is created at most once for a given (chain, block) pair. This behavior is implemented in proxy_services.

The alcotest suite tests the following:

  • That the implementation of RequestsTree is correct: it tests empty, add, and find_opt (test_tree).
  • That proxy_getter's implementation of proxy_getter.M honors the split_key function, i.e. that the take a parent tree heuristic is applied correctly (test_split_key_triggers).
  • That caching of data is correct, i.e. that not too many RPCs are done (test_do_rpc_no_longer_key).

The coverage report was generated as follows:

  • ./scripts/instrument_dune_bisect.sh src/lib_proxy/dune src/proto_alpha/lib_client/dune src/proto_genesis/lib_client/dune src/lib_client_base_unix/dune src/lib_mockup/dune src/lib_mockup_proxy/dune src/lib_protocol_environment/dune
  • make
  • (cd tests_python/ && pytest tests/test_proxy.py)
  • (cd tests_python/ && pytest tests/test_mockup.py)
  • (cd tests_python/ && pytest tests/test_rpc.py)
  • dune exec tezt/tests/main.exe -- --file proxy.ml
  • dune build @src/lib_proxy/runtest
  • make coverage-report

The main files of the proxy have the following coverage:

Weaknesses

  • The proxy's RPC_client.ml detects that an RPC cannot be done locally by catching the Local_RPC_error (Rpc_not_found _) error. I'm not a big fan of catching errors, but I've left it like this for now because it is simple and this error is specific to the client being delegated to. We could avoid this weakness by making the delegate client (src/lib_mockup/RPC_client.ml) expose an API tailored to the proxy (by turning this error into a regular return value). It would be awkward though.

    A rewrite of src/lib_mockup/RPC_client.ml is pending on the mockup-fake-baking branch, to avoid this workaround. Changing the way the proxy treats Local_RPC_error would be better done after that rewrite.

  • There are no tests checking that the performance of the proxy mode stays viable. Attaining correct performance took some time; we should be careful to preserve it.

Should the proxy be the default mode?

Overall, the node does not perform better when clients use the proxy mode. It could certainly perform better if .../raw/bytes were served by dedicated HTTP proxies, but without making node deployments more complex, I don't think making the proxy the default mode is worth it. In terms of development, not making the proxy the default mode would also allow merging it sooner, which will help keep its development sane (by not sitting on top of a large stack of commits).

Note that making the proxy the default mode has been coded already (all benchmarks have been done in this setting), on branch smelc-issue-154-proxy-default-mode.

Reviewers

Suggested reviewers: @igarnier @rafoo_ @klakplok
