refs/reftable: optimize write performance

Patrick Steinhardt requested to merge pks-reftable-optimize-writes into master

This is my first patch series taking an actual look at write performance for the reftable library. This series addresses two major points:

  • Duplicate directory/file conflict checks when writing refs.
  • Allocation churn when compressing log blocks.
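For readers unfamiliar with the first point: a directory/file conflict arises when one ref name is a path prefix of another, such as `refs/heads/foo` and `refs/heads/foo/bar`, so that the same name would have to be both a ref and a directory. The following is only a minimal sketch of such a check between two names. The helper `refnames_df_conflict()` is hypothetical and is not the function used by Git or the reftable library.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only; not part of Git or the
 * reftable library. Returns true if `a` and `b` cannot coexist because
 * one of them would have to be both a ref and a directory that contains
 * the other.
 */
static bool refnames_df_conflict(const char *a, const char *b)
{
	size_t alen = strlen(a), blen = strlen(b);
	const char *shorter = alen < blen ? a : b;
	const char *longer = alen < blen ? b : a;
	size_t n = alen < blen ? alen : blen;

	/* Names of equal length are either identical or unrelated. */
	if (alen == blen)
		return false;

	/* Conflict only if the shorter name is followed by '/' in the longer. */
	return !strncmp(shorter, longer, n) && longer[n] == '/';
}
```

With this sketch, `refs/heads/foo` conflicts with `refs/heads/foo/bar` in either argument order, while `refs/heads/foo` and `refs/heads/foobar` do not conflict. Running a check of this kind more than once per ref in a large transaction is the sort of duplicated work the first point refers to.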

Overall, though, I found that there is not much point in investigating write performance in the reftable library itself, mostly because write performance is heavily dominated by random ref reads. While past patch series have optimized linear scans through refs, seeking to random refs isn't well-optimized yet. I'll thus focus on random ref reads next.

For some context, here's the comparison to the "files" backend for writing many refs in a single transaction:

Benchmark 1: update-ref: create many refs (refformat = files, refcount = 100000)
  Time (mean ± σ):     10.085 s ±  0.057 s    [User: 1.876 s, System: 8.161 s]
  Range (min … max):   10.013 s … 10.202 s    10 runs

Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 100000)
  Time (mean ± σ):      2.768 s ±  0.018 s    [User: 1.381 s, System: 1.383 s]
  Range (min … max):    2.745 s …  2.804 s    10 runs

Summary
  update-ref: create many refs (refformat = reftable, refcount = 100000) ran
    3.64 ± 0.03 times faster than update-ref: create many refs (refformat = files, refcount = 100000)

And for writing many refs sequentially in separate transactions:

Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 10000)
  Time (mean ± σ):     40.286 s ±  0.086 s    [User: 22.241 s, System: 17.912 s]
  Range (min … max):   40.166 s … 40.410 s    10 runs

Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 10000)
  Time (mean ± σ):     44.046 s ±  0.137 s    [User: 23.790 s, System: 20.146 s]
  Range (min … max):   43.813 s … 44.301 s    10 runs

Summary
  update-ref: create refs sequentially (refformat = files, refcount = 10000) ran
    1.09 ± 0.00 times faster than update-ref: create refs sequentially (refformat = reftable, refcount = 10000)

This is, to the best of my knowledge, the last area where the "files" backend outperforms the "reftable" backend.

Closes #289 (closed).
