Optimize CFANode.get*Edges().contains() for extreme cases (!327) · Merge requests · SoSy-Lab / Software / CPAchecker

Typically, CFANodes have only 1-2 entering and leaving edges. There it does not matter what we use as a data structure, even an O(n) contains check is fast if n==2 (using a set would likely not be faster).

However, there are two cases where the number of entering and leaving edges is basically unbounded: entering edges on FunctionEntryNodes and leaving edges on FunctionExitNodes. And it turns out that because LocationTransferRelation does allLeavingEdges().contains(), this can be a huge performance problem (cf. #1388 (closed)). So we add an optimization for these two cases and use a set.

The main change is in 30a8db0b. The other commits are basically refactorings that are necessary to allow us to implement this: Only after moving the CFAUtils.*Edges() methods to CFANode, we can access the internal data structures of CFANode directly and also make use of dynamic binding and override only the cases that we want without affecting other cases.

This MR implements basically the minimal solution. In particular, it does not change the data structure used for storing entering and leaving edges of CFANodes in general. This is safer, because maybe sometimes we have edges that are treated as duplicates due to their equals() implementation and would be lost if we change to using a set. Maybe this case does not exist, but it would at least need further investigation.

For the two cases mentioned above, we are sure that there are no problems. We also do not need to do something special for allEnteringEdges()/allLeavingEdges(), because we know that for the two relevant cases there are no summary edges and thus the all*Edges() methods can just delegate to the variants without "all" in the name (this is also part of this MR).

We can also still wrap the set in a FluentIterable such that we can keep the same return type, because whenever possible the FluentIterable delegates calls to the wrapped data structure if the latter is a Collection.

So taken together, although it is not visible when looking at the types of the CFANode API and it might be a little bit brittle because it depends on a few pieces of code all fitting together in the right way, it does achieve the performance improvement that is necessary to fix #1388 (closed).

I do not have an exact 1:1 comparison of only this change, but here are tables comparing 4938c431 with the state of 30a8db0b on integration-nightly-svcomp: results.2025-09-17_09-39-35.diff.html results.2025-09-17_09-39-35.table.html We get ca. 40 more correct results that were timeouts before.

05_fuzzle_70x70.c, which triggered #1388 (closed), can be solved in 200s now. Part of that are still 27s for the transfer relation (plus likely a similar amount for the transfer relation of the TargetLocationProvider), out of which most is still caused by the many leaving function-return edges: While we optimized the contains() check in LocationTransferRelation, it is still the case that CompositeTransferLocation calls AbstractStateWithLocations.getOutgoingEdges() (which returns up to 4000 edges for this task). It then iterates over these and calls the transfer relations of all components, so in particular the LocationTransferRelation and the CallstackTransferRelation are still called up to 4000 times, and only the more costly transfer relations of other CPAs are not because CallstackTransferRelation rules out all but one edge. But this is harder to improve with our design of independent CPAs.

Of course we can in the future revisit this and maybe improve the data structures of the CFA more generally, but for now this is not needed.

Optimize CFANode.get*Edges().contains() for extreme cases

Merge request reports