reduce RSSubCode allocs
This reduces allocs per op by about 50%, and bytes allocated per op by about 43%:
master:
BenchmarkRSSubCodeRecover-4 247.51 MB/s 241153 B/op 1280 allocs/op
this branch:
BenchmarkRSSubCodeRecover-4 277.40 MB/s 138240 B/op 617 allocs/op
Unfortunately we cannot do much better than this without modifying the reedsolomon package. reedsolomon allocates some temporary buffers during decoding and passes them to different goroutines, forcing them to be heap-allocated. This isn't a big deal when you're decoding 4MB shards, but because the subcode calls Recover on every segment, we end up doing hundreds of thousands of allocations per sector.
I was able to achieve zero-alloc encoding and decoding in us by forking reedsolomon and removing the parallelism. Such a change is unlikely to be accepted upstream, though: larger decodes benefit from parallelism, and most people aren't doing subcodes. So vendoring reedsolomon may be our best bet here.
Edited by Luke Champine