Improvments for TDictionary (Optimizations / less repeated hash calculations / some new methods) (!988) · Merge requests · FPC / FPC / FPC Source

The 3 commits reuse existing hash belong together.

If a dictionary is resized, then the hashes are already stored (except for those that use more than one hash). So there is no need to calculate them again.

However rehash could be used by subclasses that have the ability to change their hash algorithm. Then hashes need to be calculated again.

Any such class would not override the new FindBucketIndexForHash
FindBucketIndexForHash takesvar AHash and forwards to the existing method. The old behaviour is kept.

If any such class wants to override it, there are 2 further commit that add optional protections.

A wrapper method, that sets a flag in a private field.
The wrapper is only called for resizing operations.

Use the existing AForce to also force hash recalculations.
Since AForce is used to rehash even if the size did not change, then this behaviour seems consequent.

When doing TryAdd or AddOrSet the hash and bucket index are calculated twice. First to find any existing, and second for adding. As it is the same key, the hash can be re-used.

If PrepareAddingItem changed the size, then the index is recalculated. The hash can still be kept.
If any subclass needs the hash to be recalculated too, it can do so by not subclassing FindBucketIndexForHash, or taking care inside its overridden method

GetOrAddMutableValue

Currently there is

GetMutableValue which doesn't allow adding
AddOrSetValue which does not allow keeping the old value.
TryAdd which also does not allow keeping the old value.

But for any app that wants to

Get an existing value
If not found, create an new value ( which may be costly ) and then add it

This task currently must calculate the hash for the key twice (and search the bucket index twice).

The new method allows this with a single hash calc (and in most cases single search for bucket index). If not found the caller gets a pointer to an empty value slot, and can then create the new value and store it.

GetKeyFromBucket

Please see https://lists.freepascal.org/pipermail/fpc-devel/2025-April/045928.html

Allows subclasses to access keys and values for a bucket index.

In the example case

the key is pointer to some data (the data is hashed).
an independent copy of the same data may exist
given the pointer to the new data, the existing key may need to be found

A subclassed dictionary can call FindBucketIndex to see if the key exists. With the new methods it can access the existing key.

overloaded FindBucketIndex

Since subclasses do not have access to FItems they can not call the overload that returns the hash for the key.

If subclasses want to implement their own GetOrAdd variants, then the need the hash returned.

Improvments for TDictionary (Optimizations / less repeated hash calculations / some new methods)

Merge request reports