As per title, the following are the benchmark results, without this change:
VerifyNestedIfScript, 5, 100, 0.134325, 0.000265831, 0.000271621, 0.000267673
and with this change:
VerifyNestedIfScript, 5, 100, 0.0695281, 0.000136347, 0.00014221, 0.000139664
as you can see we halved the execution time for a script containing 100 nested
ifs which is the maximum possible right now due to the 201 ops limit.
I think that this change would be pretty useful once BCH would lift the max ops per script constraint.
Original PR description:
While investigating what mechanisms are possible to maximize the per-opcode verification cost of scripts, I noticed that the logic for determining whether a particular opcode is to be executed is O(n) in the nesting depth. This issue was also pointed out by Sergio Demian Lerner in https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts, and this PR implements a variant of the O(1) algorithm suggested there.
This is not a problem currently, because even with a nesting depth of 100 (the maximum possible right now due to the 201 ops limit), the slowdown caused by this on my machine is around 70 ns per opcode (or 0.25 s per block) at worst, far lower than what is possible with other opcodes.
This PR mostly serves as a proof of concept that it's possible to avoid it, which may be relevant in discussions around increasing the opcode limits in future script versions. Without it, the execution time of scripts can grow quadratically with the nesting depth, which very quickly becomes unreasonable.
This improves upon #14245 by completely removing the vfExec vector.