Lacks comprehensive benchmark suite
To make Obnam actually fast, we need a benchmark suite that measures every interesting use case. What are they? I'll start a list; we can collect more, and split off an issue blocking this one for each thing we want to measure.
For me, the two types of data I seem to have are large numbers of archived emails in Maildirs, and large files (video, VM images, and such). These can, I think, be synthesized into two extreme and simplistic cases:
- a directory tree with a large number of empty files
  - emails in my archives tend to be small; the performance issues come from the number of files, not their contents
- a large file without data duplication
Both types of data sets should be measured for both initial and incremental backup.
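A minimal sketch of how the two synthetic data sets could be generated. The function names, directory layout, and default sizes are my invention, not anything Obnam ships:

```python
import os
import pathlib


def make_small_files_tree(root, n_dirs=100, files_per_dir=1000):
    """Synthesize a Maildir-like tree: many directories of empty files."""
    root = pathlib.Path(root)
    for d in range(n_dirs):
        subdir = root / f"dir{d:04}"
        subdir.mkdir(parents=True, exist_ok=True)
        for f in range(files_per_dir):
            (subdir / f"msg{f:06}").touch()


def make_large_file(path, size=1 << 30, block=1 << 20):
    """Synthesize one large file of random data, so nothing deduplicates."""
    with open(path, "wb") as f:
        written = 0
        while written < size:
            n = min(block, size - written)
            f.write(os.urandom(n))
            written += n
```

Generating the data once and keeping it around would make repeated runs comparable; random data also means the deduplication code gets no accidental help.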
Those are whole-system benchmarks. The client and the server should probably also each have benchmarks for specific aspects, such as:
- how fast does the client chunk large amounts of data, with different chunking parameters?
- how fast are chunk lookups on the server when it stores a large number of chunks?
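For the client-side question, a parameter sweep might look something like the following. This uses naive fixed-size chunking as a stand-in; Obnam's actual chunking algorithm and parameters may well differ:

```python
import time


def chunk_throughput(data, chunk_size):
    """Time slicing `data` into fixed-size chunks of `chunk_size` bytes."""
    start = time.perf_counter()
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    elapsed = time.perf_counter() - start
    return len(chunks), elapsed


# 8 MiB of zeros is plenty for a smoke test; a real run would use more
# data, and real file contents.
data = bytes(8 << 20)
for size in (4096, 65536, 1 << 20):
    n, t = chunk_throughput(data, size)
    print(f"chunk size {size}: {n} chunks in {t:.4f}s")
```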
Ideally we will have a benchmark suite that we can run repeatedly and conveniently.
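The harness for that could start as small as a timing wrapper around commands. The `obnam` invocations below are hypothetical placeholders; the actual CLI may differ:

```python
import subprocess
import time


def bench(name, cmd, repeats=3):
    """Run a command several times and report the best wall-clock time."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        times.append(time.perf_counter() - start)
    best = min(times)
    print(f"{name}: best of {repeats} runs: {best:.3f}s")
    return best


# Hypothetical invocations; the real obnam command line may differ:
# bench("initial backup, many small files", ["obnam", "backup", "maildir-tree"])
# bench("initial backup, one large file", ["obnam", "backup", "large-file"])
```

Reporting the best of several runs reduces noise from caches and background load; a fuller suite would also record results over time so regressions show up.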