Skip to content

reftable/stack: use geometric table compaction

Justin Tobler requested to merge jt/reftable-fix-segment-reset into master

To reduce the number of on-disk reftables, compaction is performed. Contiguous tables with the same binary log value of size are grouped into segments. The segment that has both the lowest binary log value and contains more than one table is set as the starting point when identifying the compaction segment.

Since segments containing a single table are not initially considered for compaction, if the table appended to the list does not match the previous table log value, no compaction occurs for the new table. It is therefore possible for unbounded growth of the table list. This can be demonstrated by repeating the following sequence:

git branch -f foo

git branch -d foo

Each operation results in a new table being written with no compaction occurring until a separate operation produces a table matching the previous table log value.

To avoid unbounded growth of the table list, walk through each table and evaluate if it needs to be included in the compaction segment to restore a geometric sequence.

Some tests in t0610-reftable-basics.sh assert the on-disk state of tables and are therefore updated to specify the correct new table count. Since compaction is more aggressive in ensuring tables maintain a geometric sequence, the expected table count is reduced in these tests. In reftable/stack_test.c tests related to sizes_to_segments() are removed because the function is no longer needed. Also, the test_suggest_compaction_segment() test is updated to better showcase and reflect the new geometric compaction behavior.

Related: #258 (closed)

Edited by Justin Tobler

Merge request reports