In-place ops are not happening for simple operations
I have been trying to benchmark a WIP branch of the apint crate against num-bigint and rug. rug does outperform apint and num-bigint for addition, but only after the integer lengths get into the thousands of bits.
Just returning a clone of a small Integer takes about 70 ns/iter, just like apint and num-bigint do.
#![feature(test)] // `#[bench]` needs the nightly `test` crate
extern crate test;

use rug::Integer;
use test::{black_box, Bencher};

// Just over 64 bits: anything smaller would trigger the small-value optimization, for `apint` at least.
const HEX: &str = "17f3feabf73e71234";

#[bench]
fn allocate_rug(b: &mut Bencher) {
    // Types are explicitly annotated to make sure that the inputs and outputs are completed operations.
    let int0: Integer = black_box(Integer::from(Integer::parse_radix(HEX, 16).unwrap()));
    b.iter(|| {
        let o: Integer = Integer::from(&int0);
        o
    })
}
However, once anything else is done, another allocation appears to happen and the time for rug doubles, putting it well behind in the benchmarks. This happened for everything I tried except repeated addition such as Integer::from(&int0) + &int0 + &int0; where each addition after the first adds only about 20 ns instead of 70 ns. Maybe the optimizer is doing something special, and I should really use a custom allocator to count the actual number of allocations, but either way something is wrong with the performance.
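For reference, a minimal sketch of the allocation-counting idea, using only the standard library. The names CountingAlloc, ALLOCATIONS and count_allocations are hypothetical helpers made up for this illustration. One caveat I have not verified: a GMP-backed crate may allocate through the C allocator rather than Rust's global allocator, in which case this counter would not see those allocations.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts `alloc` calls.
struct CountingAlloc;

/// Total number of allocations made through the global allocator so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the number of allocations
/// observed while it ran. Allocations made by other threads in the same
/// window are counted too, so this is only meaningful single-threaded.
fn count_allocations<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (result, after - before)
}

fn main() {
    // `black_box` keeps the optimizer from eliding the heap allocation.
    let (v, n) = count_allocations(|| black_box(vec![1u64, 2, 3]));
    println!("{} allocation(s) for a Vec of {} elements", n, v.len());
}
```

Wrapping the body of a benchmark closure in count_allocations (outside the timed loop) would show whether an expression really performs one allocation or two, assuming the library under test goes through the Rust allocator at all.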
I like the idea of having incomplete operations, which should cut down on the number of functions needed and automatically select the most efficient assignment or allocation, but I feel like the documentation is missing some details, such as why Integer::from(&int0) + &int0 + &int0 works but &int0 + &int0 + &int0 does not, and why the first form doesn't require an outer Integer::from as in Integer::from(Integer::from(&int0) + &int0 + &int0).