
reftable: prepare for re-seekable iterators

Patrick Steinhardt requested to merge pks-reftable-seek-refactorings into master

The reftable library uses iterators both to iterate through a set of records and to look up a single record. In past patch series, I have focused quite a lot on optimizing the case where we iterate through a large set of records. But looking up a record is still quite inefficient when doing multiple lookups. This is because whenever we want to look up a record, we need to create a new iterator, including all of its internal data structures.

To address this inefficiency, the patch series at hand refactors the reftable library such that creation of iterators and seeking on an iterator are separate steps. This refactoring prepares us for reusing iterators to perform multiple seeks, which in turn will allow us to reuse internal data structures for subsequent seeks.
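To illustrate the idea of separating creation from seeking, here is a minimal, self-contained sketch. It is not the reftable API: `toy_table`, `toy_iter` and all `toy_iter_*()` functions are hypothetical names invented for this example, and the `scratch` buffer merely stands in for whatever internal data structures a real iterator would keep around between seeks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sorted table of ref names; not a reftable data structure. */
struct toy_table {
	const char **names;
	size_t len;
};

/* Hypothetical iterator; `scratch` stands in for costly internal state. */
struct toy_iter {
	const struct toy_table *table;
	size_t pos;
	char *scratch;
	size_t scratch_cap;
	unsigned allocs; /* counts scratch (re)allocations, for illustration */
};

/* Step 1: create the iterator. No positioning happens here. */
static void toy_iter_init(struct toy_iter *it, const struct toy_table *t)
{
	memset(it, 0, sizeof(*it));
	it->table = t;
	it->pos = t->len; /* yields nothing until the first seek */
}

/*
 * Step 2: position the iterator at the first name >= `want`.
 * The scratch buffer only grows when needed, so later seeks can reuse it.
 * Returns 0 on success, -1 on allocation failure.
 */
static int toy_iter_seek(struct toy_iter *it, const char *want)
{
	size_t need = strlen(want) + 1, lo = 0, hi;

	assert(it && it->table);
	if (need > it->scratch_cap) {
		char *p = realloc(it->scratch, need);
		if (!p)
			return -1;
		it->scratch = p;
		it->scratch_cap = need;
		it->allocs++;
	}
	memcpy(it->scratch, want, need);

	/* Binary search for the lower bound of the wanted name. */
	hi = it->table->len;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(it->table->names[mid], it->scratch) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	it->pos = lo;
	return 0;
}

/* Returns the next name, or NULL once the iterator is exhausted. */
static const char *toy_iter_next(struct toy_iter *it)
{
	if (it->pos >= it->table->len)
		return NULL;
	return it->table->names[it->pos++];
}

/* Frees the internal state once the caller is done with all seeks. */
static void toy_iter_release(struct toy_iter *it)
{
	free(it->scratch);
	memset(it, 0, sizeof(*it));
}
```

A caller would invoke `toy_iter_init()` once, then call `toy_iter_seek()` as often as needed, and finally `toy_iter_release()`. In this toy model, a second seek with a key that is no longer than the first one does not allocate again, which is the kind of reuse the refactoring is meant to enable.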

The patch series is structured as follows:

  • Patches 1 to 5 perform some general cleanups to make the reftable iterators easier to understand.

  • Patches 6 to 9 refactor the iterators internally such that creation of the iterator and seeking on it are clearly separated.

  • Patches 10 to 13 adapt the external interfaces such that they allow for reuse of iterators.

Note: this series does not yet go all the way to re-seekable iterators, and there are no users yet. The patch series is already complex enough as it is, so I decided to defer that to the next iteration. Thus, the whole refactoring here should essentially be a large no-op that prepares the infrastructure for re-seekable iterators.

The series depends on pks/reftable-write-optim at fa74f322 (reftable/block: reuse compressed array, 2024-04-08).

Part of reftable: allow reusing iterators across multip... (#272 - closed).

Edited by Patrick Steinhardt
