
Implement concurrent multihash for workhorse

Arran Walker requested to merge ajwalker/concurrent-multihash into master

What does this MR do and why?

Workhorse produces multiple checksums (md5, sha1, sha256, sha512) as artifacts are being uploaded on-the-fly.

These checksums are produced sequentially: on every call to Write(), the chunk of data is written to each hash implementation in turn. The time to complete is therefore the combined time of the writes to all hash implementations.
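As a rough illustration of the sequential behaviour described above, here is a minimal sketch built on io.MultiWriter. The type sequentialMultiHash and its methods are hypothetical names chosen for this example and are not taken from the Workhorse code.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
)

// sequentialMultiHash is a hypothetical type for illustration only;
// it is not Workhorse's actual implementation.
type sequentialMultiHash struct {
	names  []string
	hashes []hash.Hash
	w      io.Writer
}

func newSequentialMultiHash() *sequentialMultiHash {
	names := []string{"md5", "sha1", "sha256", "sha512"}
	hashes := []hash.Hash{md5.New(), sha1.New(), sha256.New(), sha512.New()}
	writers := make([]io.Writer, len(hashes))
	for i, h := range hashes {
		writers[i] = h
	}
	return &sequentialMultiHash{names: names, hashes: hashes, w: io.MultiWriter(writers...)}
}

// Write feeds p to each hash one after another, so the cost of a call
// is the sum of the costs of all four hash writes.
func (m *sequentialMultiHash) Write(p []byte) (int, error) {
	return m.w.Write(p)
}

// Sums returns the hex-encoded digest of everything written so far.
func (m *sequentialMultiHash) Sums() map[string]string {
	out := make(map[string]string, len(m.hashes))
	for i, h := range m.hashes {
		out[m.names[i]] = hex.EncodeToString(h.Sum(nil))
	}
	return out
}

func main() {
	m := newSequentialMultiHash()
	io.WriteString(m, "hello ")
	io.WriteString(m, "world")
	for name, sum := range m.Sums() {
		fmt.Println(name, sum)
	}
}
```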

This merge request introduces a concurrent multihash. The goal is for the total time to complete to be close to the time taken by our slowest hash implementation, rather than the sum of all of them. This is achieved by using a global job queue to handle batches of writes. Benchmarks are included for each individual hash, for sequential multihashing, and for concurrent multihashing.
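The idea of a shared job queue can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this merge request: the names job, jobs, ensureWorkers and concurrentMultiHash are hypothetical, the worker count is assumed to be GOMAXPROCS, and the sketch submits one job per hash per Write() instead of batching writes.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"runtime"
	"sync"
)

// job asks a worker to write one chunk into one hash (hypothetical type).
type job struct {
	h    hash.Hash
	data []byte
	wg   *sync.WaitGroup
}

// jobs is a process-wide queue shared by every concurrentMultiHash.
var jobs = make(chan job)

var startWorkers sync.Once

// ensureWorkers starts the shared worker pool exactly once.
// Assumption: one worker per GOMAXPROCS.
func ensureWorkers() {
	startWorkers.Do(func() {
		for i := 0; i < runtime.GOMAXPROCS(0); i++ {
			go func() {
				for j := range jobs {
					j.h.Write(j.data) // hash.Hash.Write never returns an error
					j.wg.Done()
				}
			}()
		}
	})
}

type concurrentMultiHash struct {
	names  []string
	hashes []hash.Hash
}

func newConcurrentMultiHash() *concurrentMultiHash {
	ensureWorkers()
	return &concurrentMultiHash{
		names:  []string{"md5", "sha1", "sha256", "sha512"},
		hashes: []hash.Hash{md5.New(), sha1.New(), sha256.New(), sha512.New()},
	}
}

// Write queues one job per hash and waits for all of them. With spare
// CPU the call takes about as long as the slowest hash. Waiting before
// returning keeps writes ordered per hash and means p is not retained.
func (m *concurrentMultiHash) Write(p []byte) (int, error) {
	var wg sync.WaitGroup
	wg.Add(len(m.hashes))
	for _, h := range m.hashes {
		jobs <- job{h: h, data: p, wg: &wg}
	}
	wg.Wait()
	return len(p), nil
}

// Sums returns the hex-encoded digest of everything written so far.
func (m *concurrentMultiHash) Sums() map[string]string {
	out := make(map[string]string, len(m.hashes))
	for i, h := range m.hashes {
		out[m.names[i]] = hex.EncodeToString(h.Sum(nil))
	}
	return out
}

func main() {
	m := newConcurrentMultiHash()
	m.Write([]byte("hello "))
	m.Write([]byte("world"))
	for name, sum := range m.Sums() {
		fmt.Println(name, sum)
	}
}
```

Because every uploader shares the same queue and worker pool, a busy system degrades towards sequential behaviour instead of spawning more goroutines than there are CPUs to run them.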

To better simulate the scenario of multiple uploaders, with multiple files being hashed concurrently using the same global job queue, a noisy neighbour benchmark is also included. It shows that in the worst case, concurrent hashing performs no worse than sequential hashing, and when spare CPU is available it is much faster.

Benchmarks

Benchmark names encode the write size and the number of writes: 8B-x1 is a single 8-byte write, 8KiB-x1 is a single 8 KiB write, and 8KiB-x1024 is 1024 writes of 8 KiB each. The same applies to 64B and 64KiB.
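To relate the ns/op and MB/s columns to this naming scheme, the small helper below recomputes throughput from the write size, the number of writes, and the time per operation. It assumes the benchmarks report the total bytes written per operation (as a Go benchmark does when it calls b.SetBytes) and that MB means 10^6 bytes; throughputMBps is a hypothetical helper written for this example.

```go
package main

import "fmt"

// throughputMBps converts one benchmark operation (writes x writeSize
// bytes completed in nsPerOp nanoseconds) into MB/s, with MB = 1e6 bytes.
// Hypothetical helper for illustration only.
func throughputMBps(writeSize, writes int64, nsPerOp float64) float64 {
	totalBytes := float64(writeSize * writes)
	seconds := nsPerOp / 1e9
	return totalBytes / seconds / 1e6
}

func main() {
	const KiB = 1024

	// 64KiB-x1024-multi-sequential: 408774834 ns/op in the results below.
	fmt.Printf("%.2f MB/s\n", throughputMBps(64*KiB, 1024, 408774834))

	// 8B-x1-multi-sequential: 1783 ns/op in the results below.
	fmt.Printf("%.2f MB/s\n", throughputMBps(8, 1, 1783))
}
```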

Each individual hash

cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
BenchmarkHashes/64B-x1024-md5-8       13          87738880 ns/op         764.87 MB/s          16 B/op          1 allocs/op
BenchmarkHashes/64B-x1024-sha1-8      19          61092201 ns/op        1098.48 MB/s          24 B/op          1 allocs/op
BenchmarkHashes/64B-x1024-sha256-8    7          155338042 ns/op         432.02 MB/s          32 B/op          1 allocs/op
BenchmarkHashes/64B-x1024-sha512-8    10         104396540 ns/op         642.83 MB/s          64 B/op          1 allocs/op

Sequential and Concurrent

cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
BenchmarkHashes/8B-x1-multi-sequential-8        612009  1783 ns/op        4.49 MB/s        1640 B/op         19 allocs/op
BenchmarkHashes/8B-x1-multi-concurrent-8        775539  1558 ns/op        5.14 MB/s        1024 B/op         15 allocs/op
BenchmarkHashes/8B-x1024-multi-sequential-8     13988   85764 ns/op       95.52 MB/s       1640 B/op         19 allocs/op
BenchmarkHashes/8B-x1024-multi-concurrent-8     19220   62258 ns/op       131.58 MB/s      1024 B/op         15 allocs/op
BenchmarkHashes/8KiB-x1-multi-sequential-8      22801   52556 ns/op       155.87 MB/s      1640 B/op         19 allocs/op
BenchmarkHashes/8KiB-x1-multi-concurrent-8      28068   42676 ns/op       191.96 MB/s      1024 B/op         15 allocs/op
BenchmarkHashes/8KiB-x1024-multi-sequential-8   22      51304800 ns/op    163.51 MB/s      1640 B/op         19 allocs/op
BenchmarkHashes/8KiB-x1024-multi-concurrent-8   30      38324398 ns/op    218.88 MB/s      1219 B/op         15 allocs/op
BenchmarkHashes/64B-x1-multi-sequential-8       528889  2103 ns/op        30.43 MB/s       1640 B/op         19 allocs/op
BenchmarkHashes/64B-x1-multi-concurrent-8       630391  1924 ns/op        33.27 MB/s       1024 B/op         15 allocs/op
BenchmarkHashes/64B-x1024-multi-sequential-8    2466    484536 ns/op      135.26 MB/s      1640 B/op         19 allocs/op
BenchmarkHashes/64B-x1024-multi-concurrent-8    2895    397874 ns/op      164.72 MB/s      1024 B/op         15 allocs/op
BenchmarkHashes/64KiB-x1-multi-sequential-8     2948    404442 ns/op      162.04 MB/s      1640 B/op         19 allocs/op
BenchmarkHashes/64KiB-x1-multi-concurrent-8     5845    190253 ns/op      344.47 MB/s      1025 B/op         15 allocs/op
BenchmarkHashes/64KiB-x1024-multi-sequential-8  3       408774834 ns/op   164.17 MB/s      1640 B/op         19 allocs/op
BenchmarkHashes/64KiB-x1024-multi-concurrent-8  6       180703082 ns/op   371.38 MB/s      2001 B/op         16 allocs/op

Noisy Neighbour

cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
NoiseyNeighbours/64KiB-x1024-multi-sequential-8  1       2880417382 ns/op  23.30 MB/s       67952 B/op       675 allocs/op
NoiseyNeighbours/64KiB-x1024-multi-concurrent-8  1       821732690 ns/op   23.78 MB/s       204504 B/op      824 allocs/op

How to set up and validate locally

go test -bench Benchmark ./internal/filestore/multihash

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Arran Walker
