Questionable cache in ExpressionTreeLocationInvariant
Instances of the class ExpressionTreeLocationInvariant
, which are used for storing invariants from a witness for k-induction, predicate abstraction, or for witness validation, contain a cache for ToFormulaVisitor
instances. The latter are used for converting invariants into formulas in the getFormula
method.
There are several possible issues related to this cache:
Thread Safety
A ConcurrentMap
is used for this cache, indicating that thread safety might be relevant. However, ToFormulaVisitor
itself (actually, its super class CachingVisitor
) uses a plain hash map for caching. This could in principle be ok (if it is guaranteed that each visitor is used only in a single thread and only multiple visitors need to be handled in a thread-safe way), but it is not clear and not documented.
Memory Leak
The cache stores references to PathFormulaManager
s, which typically again caches and stores references to lots of formulas, and of course to the solver. This can be a large memory leak, e.g., in the following situation: analysis A creates ExpressionTreeLocationInvariant
s and passes them to two analyses B and C, which both use the caching method getFormula
. Now lots of memory that belongs to B won't be able to get garbage collected as long as C stores references to the invariants, even if for example B is finished soon afterwards.
Design
Most code in CPAchecker follows the general principle of immutable data objects and separate classes that contain most of the code and store mutable state and context. This reduces memory consumption of the data objects (of which there are much more than instances of the code classes) and simplifies understanding of and reasoning about the data classes. It also naturally avoids issues like those discussed above. Thus it would be good if the class here would also follow this principle. Localized caches could be used by those consumers of ExpressionTreeLocationInvariant
where it is relevant and passed via a parameter. For example, in some cases it seems clear that a long-term cache (that stores visitors for later instead of just caching while traversing a single or a few related invariants) is not useful because the cache key contains a PathFormula
for context, which is unlikely to appear again in identical form.
This cached was added in d683c409.