Protocol: context flattening of indexed storages

Context

The commits flatten the "indexed storages": the indices of Contract_repr, Roll_repr, and Big_map are now stored without their intermediate hex/digit directories.

The migration code in init_storage.ml renames the directories of these types:

  • /contracts/index/xx/xx/xx/xx/xx/xx/yyyyyyyy => /contracts/index/yyyyyyyy
  • /contracts/index/xx/xx/xx/xx/xx/xx/yyyyyyyy/delegated/xx/xx/xx/xx/xx/xx/zzzzzzzz => /contracts/index/yyyyyyyy/delegated/zzzzzzzz
  • /big_maps/index/xx/xx/xx/xx/xx/xx/n => /big_maps/index/n
  • /rolls/index/x/y/n => /rolls/index/n
  • /rolls/owner/current/x/y/n => /rolls/owner/current/n
  • /rolls/owner/snapshot/n1/n2/x/y/n3 => /rolls/owner/snapshot/n1/n2/n3
  • /commitments/xx/xx/xx/xx/xx/xxxxxxxx => /commitments/xxxxxxxxxxxxxxxxxx
  • /votes/listings/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /votes/listings/kk/xxxxxxxxxxxxxxxxxxxx
  • /votes/ballots/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /votes/ballots/kk/xxxxxxxxxxxxxxxxxxxx
  • /votes/proposals_count/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /votes/proposals_count/kk/xxxxxxxxxxxxxxxxxxxx
  • /votes/proposals/xx/xx/xx/xx/xx/xxxxxxxxxx/kk/yy/yy/yy/yy/yy/yyyyyyyyyy => /votes/proposals/xxxxxxxxxxxxxxxxxxxx/kk/yyyyyyyyyyyyyyyyyyyy
  • /delegates/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /delegates/kk/xxxxxxxxxxxxxxxxxxxx
  • /active_delegates_with_rolls/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /active_delegates_with_rolls/kk/xxxxxxxxxxxxxxxxxxxx
  • /delegates_with_frozen_balance/kk/xx/xx/xx/xx/xx/xxxxxxxxxx => /delegates_with_frozen_balance/kk/xxxxxxxxxxxxxxxxxxxx
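The list above shows two kinds of rewrite: in some paths the intermediate directories are simply dropped (contracts, big maps, rolls), while in others the key was split across several segments and is rejoined into one (commitments, votes, delegates). A minimal sketch of both on paths represented as string lists follows. The helper names split_at, drop_intermediate and join_segments are hypothetical and are not the code in init_storage.ml, which works on context trees.

```ocaml
(* Hypothetical helpers for illustration only; not taken from init_storage.ml. *)

(* Split a list into its first [n] elements and the remainder.
   Returns [None] if [n] is negative or the list is shorter than [n]. *)
let split_at n l =
  let rec go n acc l =
    if n = 0 then Some (List.rev acc, l)
    else
      match l with
      | [] -> None
      | x :: tl -> go (n - 1) (x :: acc) tl
  in
  if n < 0 then None else go n [] l

(* Kind 1: drop [depth] intermediate directories.
   e.g. xx/xx/xx/xx/xx/xx/yyyyyyyy => yyyyyyyy *)
let drop_intermediate ~depth segs =
  match split_at depth segs with
  | Some (_, rest) when rest <> [] -> Some rest
  | _ -> None

(* Kind 2: rejoin a key that was split across [depth] segments.
   e.g. xx/xx/xx/xx/xx/xxxxxxxx => xxxxxxxxxxxxxxxxxx *)
let join_segments ~depth segs =
  if depth < 1 then None
  else
    match split_at depth segs with
    | Some (parts, rest) -> Some (String.concat "" parts :: rest)
    | None -> None
```

Both functions return None on a path that is too short, so a caller could detect an unexpected layout instead of silently producing a wrong key.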

This flattening changes the context hash. The resulting hash also depends on how Irmin hashes larger directories (nelems > 256 (or 255?)), and that calculation is itself being changed.

RPC incompatibility

The MR changes the output of the RPC $blockid/context/raw/bytes/... for the flattened paths.

How to test

I tested the code as follows:

  1. Download the latest mainnet rolling snapshot from https://mainnet.xtz-shots.io/
  2. Import it with ./tezos-node snapshot import --data-dir $HOME/tezos-node-test tezos-mainnet-xxxxx.rolling
  3. dune exec ./tezt/manual_tests/main.exe -- migration --verbose

The migration itself finishes. The test then fails while forging a block, for an unrelated reason:

[01:52:55.070] [node2] Mar 31 10:52:55.071 - alpha: migration finished
[01:53:21.022] [client2] Mar 31 10:53:21.022 - 008-PtEdo2Zk.baking.forge: found 0 valid operations (0 refused) for timestamp 2021-03-30T02:08:12.000-00:00 (fitness 01::00000000000b7670)
[01:53:22.052] [client2] Fatal error:
[01:53:22.052] [client2]   "Assert_failure src/lib_crypto/p256.ml:292:28"
[01:53:22.061] client2 exited with code 1.

Migration benchmark

I think the code is as fast as it can be: it works directly on trees and does not decode path names with Data_encoding.

It takes around 2 minutes on my MacBook Pro (13-inch, 2017), 2.3GHz Core i5, 16GB memory, SSD, for the latest mainnet rolling snapshot.

Protocol environment change

The MR redefines the path encodings of Blake2B and public key hashes in the protocol. To hide the original path encodings defined in the shell, the following signatures are dropped from protocol environment V3:

(* S.INDEXES *)

val to_path : t -> string list -> string list
val of_path : string list -> t option
val of_path_exn : string list -> t
val prefix_path : string -> string list
val path_length : int
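For illustration, a flat encoding with the same shape as the dropped signatures could look like the sketch below, where a key occupies a single hex path segment and path_length is 1. The module name Flat_hex and the choice of string as the key type are assumptions made for this example; this is not the encoding defined by the MR.

```ocaml
(* Hypothetical single-segment path encoding; not the MR's implementation. *)
module Flat_hex = struct
  type t = string (* raw bytes, standing in for a hash value *)

  let path_length = 1

  (* Encode each byte as two lowercase hex characters. *)
  let to_hex s =
    String.concat ""
      (List.map
         (fun c -> Printf.sprintf "%02x" (Char.code c))
         (List.of_seq (String.to_seq s)))

  (* Decode one lowercase hex digit. *)
  let digit c =
    match c with
    | '0' .. '9' -> Some (Char.code c - Char.code '0')
    | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
    | _ -> None

  (* Decode a hex string; [None] on odd length or a non-hex character. *)
  let of_hex h =
    let n = String.length h in
    if n mod 2 <> 0 then None
    else
      let buf = Bytes.create (n / 2) in
      let rec go i =
        if i = n / 2 then Some (Bytes.to_string buf)
        else
          match (digit h.[2 * i], digit h.[(2 * i) + 1]) with
          | Some hi, Some lo ->
              Bytes.set buf i (Char.chr ((hi * 16) + lo));
              go (i + 1)
          | _ -> None
      in
      go 0

  (* The whole key becomes one path segment, prepended to [rest]. *)
  let to_path t rest = to_hex t :: rest

  (* Accept exactly one segment, matching [path_length]. *)
  let of_path = function [h] -> of_hex h | _ -> None
end
```

With an encoding of this shape, a key costs one directory level instead of six, which is the property the flattening relies on.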

Related MRs

Checklist

  • CI
  • Find the cause of the P256 error: it is unrelated to this MR.
  • Check that the result keeps all the data: checked that the data are properly accessed via the new storage functions and that the number of items is preserved.
  • Check that all the flattenable directories are flattened: the path_lengths in proto_alpha are all minimal. The path_lengths of Blake2B and Signature in the shell are hidden in proto env V3.
  • Benchmark the migration on other machines and with bigger data
  • Benchmark the node performance after the flattening <- This should be fun!
  • Remove the excess log messages: now minimized
  • Think about the encoding of the public key hash kinds (ed25519, secp256k1, and p256), currently encoded in their names kk. We concluded that we keep them as independent directories.
  • Evaluate cost reduction per operation
  • Wait until the Irmin version that supports the new hash is incorporated into Tezos
  • Check that all CI is green with Irmin 2.7.0
  • Wait for the fix of the memory context hash issue for huge directories
  • Port to proto_alpha based on Granada
  • Review the storage read costs and fix some hard-coded depths
Edited by Jun Furuse