Craig Ferguson requested to merge CraigFe/tezos:upgrade-to-irmin-3.0 into master Feb 16, 2022

Context

This MR upgrades lib_context to use the newly-released Irmin 3 for storage. The corresponding opam-repository MR is opam-repository!242 (merged).

Performance impact

Briefly, the main impact of this change is that context reads are now much more efficient due to bypassing the index when following internal pointers in the file. With Irmin 2.10, the process for reading an object at some path [ "foo"; ... ] in the context tree is as follows:

Look up "foo" in the root node to discover the hash of the child node named "foo";
Look up the hash in the index (by interpolation search over subregions of the index data file) to discover the offset of the child node in the pack file;
Read the child node from the pack file, repeating this process for any remaining segments in the path to navigate down the tree.

With Irmin 3, this process is streamlined to:

Look up "foo" in the root node to discover the offset of the child node named "foo" in the pack file;
Read the child node from the pack file, repeating this process for any remaining segments in the path.

This means that read operations on intermediate nodes and blobs never need to search over the index, saving both time and space in the system page cache.
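
To make the difference concrete, here is a minimal sketch of the two read paths. It is a toy model, not Irmin's actual data structures: the `pack` and `index` hashtables, the `pointer` type and the `read_counting` helper are all hypothetical names chosen for illustration, and hashes are plain strings.

```ocaml
(* Toy model of the two read paths. None of these names are Irmin's API. *)

type hash = string
type offset = int

(* A child pointer is either an Irmin 2 style hash, which must be resolved
   through the index, or an Irmin 3 style direct offset into the pack file. *)
type pointer = By_hash of hash | By_offset of offset

type obj = Blob of string | Node of (string * pointer) list

(* Stand-in for the pack file: objects addressed by offset. *)
let pack : (offset, obj) Hashtbl.t = Hashtbl.create 16

(* Stand-in for the index: maps hashes to pack file offsets. *)
let index : (hash, offset) Hashtbl.t = Hashtbl.create 16

(* Counts how many times a read had to consult the index. *)
let index_lookups = ref 0

let resolve = function
  | By_offset off -> off
  | By_hash h ->
      incr index_lookups;
      Hashtbl.find index h

(* Follow [path] starting from the object stored at offset [root]. *)
let rec find root path =
  match (Hashtbl.find pack root, path) with
  | obj, [] -> Some obj
  | Blob _, _ :: _ -> None
  | Node children, step :: rest -> (
      match List.assoc_opt step children with
      | None -> None
      | Some ptr -> find (resolve ptr) rest)

(* Run a read and report how many index lookups it needed. *)
let read_counting root path =
  index_lookups := 0;
  let result = find root path in
  (result, !index_lookups)

let () =
  (* One blob, reachable through an old-style and a new-style subtree. *)
  Hashtbl.replace pack 0 (Blob "v");
  Hashtbl.replace index "h_blob" 0;
  (* Old-style nodes: children are referenced by hash. *)
  Hashtbl.replace pack 1 (Node [ ("bar", By_hash "h_blob") ]);
  Hashtbl.replace index "h_inner" 1;
  Hashtbl.replace pack 2 (Node [ ("foo", By_hash "h_inner") ]);
  (* New-style nodes: children are referenced by offset. *)
  Hashtbl.replace pack 3 (Node [ ("bar", By_offset 0) ]);
  Hashtbl.replace pack 4 (Node [ ("foo", By_offset 3) ])
```

In this model, `read_counting 2 [ "foo"; "bar" ]` (the old-style root) performs one index lookup per path segment, while `read_counting 4 [ "foo"; "bar" ]` (the new-style root) performs none. Because `resolve` accepts both pointer kinds, the same traversal also works on a store that mixes old and new nodes.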

Change of on-disk format

To support the read mechanism described above, Irmin 3.0 introduces new node object types for the pack file that are slightly larger than the older ones (by roughly 1-2%). This slightly increases the storage requirements of newly-bootstrapped nodes, but it also unlocks a much larger reduction in storage that will come in a follow-up MR (see "Indexing strategies" below).

Irmin 3.0 is backwards-compatible with the existing store format. Users of older stores will simply get the existing, less efficient read behaviour for nodes that were written by Irmin 2, and any newly-added nodes will use the new format instead. Once an Irmin 3 object has been added to an Irmin 2 store, it is no longer possible to downgrade and use the same store (attempting to do so will result in an exception when opening the store).

There is currently no dedicated migration process for converting an Irmin 2 store to an Irmin 3 store that uses all-new objects (and so takes full advantage of the better read performance). The expectation is that the old node objects will have a natural half-life, since newly-bootstrapped stores and snapshot imports no longer produce them. A user can force the conversion by doing a snapshot export / import cycle.

Indexing strategies

One consequence of bypassing the index on read operations is that it is no longer necessary to index every object in the pack file: newly-exported intermediate nodes and blobs can always be found via direct pointers from their parents, so finding them by hash is not necessary. In fact, all object types other than commits can be omitted from the index completely. This stands to substantially reduce the size of users' contexts and make write-intensive workloads such as snapshot import and bootstrapping much more efficient, at the cost of not having perfect hash-consing at the storage layer.

Irmin 3 comes with an option for the user to specify which objects to add to the index during writes. For now, this option is set to the existing behaviour of indexing every object. I intend to make a follow-up MR recommending a change to a minimal indexing strategy, but this is not strictly related to the Irmin 3 upgrade. The speed improvements for snapshot import and bootstrapping will come as a result of that.
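
As a rough illustration of what such an option amounts to, an indexing strategy can be modelled as a predicate over object kinds. This is a sketch under that assumption only: the `kind` type, the `strategy` type and the `always` / `minimal` values are hypothetical names for this example and are not taken from Irmin's interface.

```ocaml
(* Toy model of an indexing strategy. All names are hypothetical. *)

type kind = Commit | Node | Blob

(* A strategy decides, per object kind, whether to add an index entry. *)
type strategy = kind -> bool

(* The existing behaviour: every object written is also indexed. *)
let always : strategy = fun _ -> true

(* A minimal strategy: only commits are indexed, since nodes and blobs
   can be reached through direct pointers from their parents. *)
let minimal : strategy = function Commit -> true | Node | Blob -> false

(* Number of index entries produced when writing [objects]. *)
let index_entries (strategy : strategy) (objects : kind list) : int =
  List.length (List.filter strategy objects)

(* A small write workload: one commit with two nodes and two blobs. *)
let workload = [ Blob; Blob; Node; Node; Commit ]
```

For this workload, `index_entries always workload` is 5 and `index_entries minimal workload` is 1. The trade-off mentioned above shows up here too: under `minimal`, an unindexed blob or node can no longer be found by hash, so identical objects written twice are not deduplicated by the index.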

Impact on `lib_proxy`

Irmin 3 introduces two changes that impact lib_proxy:

the type of Tree.shallow has become more general to support backend stores that have keys other than hashes. This feature isn't used by the proxy client (since it uses an in-memory backend), but the signatures in lib_context must be refactored to allow lib_proxy to continue using shallow trees.
calling Tree.list no longer returns the empty list for dangling tree nodes, and instead raises an exception. This has been fixed by adding a new is_shallow function that lib_proxy can use to guard its calls to Tree.list.
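
The guard pattern from the second change can be sketched on a toy tree type. This is not Irmin's Tree module or lib_context's signature: the `tree` type, the `Dangling` exception and `list_or_empty` are hypothetical stand-ins, and only the shape of the `is_shallow` check mirrors the description above.

```ocaml
(* Toy tree type; not Irmin's Tree module. All names are hypothetical. *)

type tree =
  | Leaf of string
  | Node of (string * tree) list
  | Shallow of string (* only the hash is known; the contents are absent *)

exception Dangling of string

(* Models the new behaviour: listing a dangling node raises instead of
   returning the empty list. *)
let list = function
  | Node children -> children
  | Leaf _ -> []
  | Shallow h -> raise (Dangling h)

let is_shallow = function Shallow _ -> true | Leaf _ | Node _ -> false

(* Guarded call that restores the old "empty list" result for callers
   that relied on it. *)
let list_or_empty t = if is_shallow t then [] else list t
```

With this guard, `list_or_empty (Shallow "abc")` returns the empty list, while an unguarded `list (Shallow "abc")` raises `Dangling`.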

These changes are both introduced by a single commit.

Manually testing the MR

Any operations that make intensive or unusual use of the context are good candidates for manual testing. For instance:

bootstrapping a fresh store with Irmin 3 and checking that the node functions as expected;
importing/exporting a snapshot to/from an Irmin 3 store.

Checklist

Document the interface of any function added or modified (see the coding guidelines)
Document any change to the user interface, including configuration parameters (see node configuration)
Provide automatic testing (see the testing guide).
For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
Select suitable reviewers using the Reviewers field below.
Select as Assignee the next person who should take action on that MR

Edited Feb 28, 2022 by Craig Ferguson

lib_context: update to use Irmin 3.1