Useless cclasses.TFPHashList revamp
In the last few years, the following hash table implementation became somewhat popular among runtime libraries: open-addressed array of indices that reference dense array of items.
indices = [2, -, −, 0, 3, −, T, 1]
items = [item0, item1, item2, item3]
This saves memory, as sizeof(item) > sizeof(index). Even if sizeof(index) is at its natural limit, sizeof(SizeUint), current cutils.THashItem worths three to four indices. Moreover, by adding a level of indirection index type can be made switchable, so that hash table with 256−2 elements (−2 are representations of empty cell − and tombstone T) could accomplish everything it needs with 1-byte indices, etc. It turns out that the gain just from using less memory fully compensates for the initial indirection cost, and then you can get yourself halfway closer back to the previous memory usage but have sparser table with less collisions.
Doing the same with cutils.TFPHashList is tricky because of different semantics regarding duplicates or removes on which other parts of the compiler have relied on (see e.g. comments on #23501 (closed)). Still, I tried to adapt that indices thing while fully preserving said semantics, and I have an impression that it works slightly but measurably faster, speeding up the compilation by 0,5% or so. But this might be false and I didn’t test it much (in particular if it works in symansistr mode at all), just tried to compile the compiler itself and my application for x86-64.
I also have changed the way how these strange permanent storages for strings attached to every TFPHashList work: instead of a single moveable block, they are allocated using non-moveable memory regions. For example, initially such a region contains (say) one 64-byte chunk. When it becomes too full to fit the next allocation request, another chunk of (say) 96 bytes is allocated and linked to it, and so on. This allows for hash records and TFPHashObject to conveniently store direct pointers to shortstrings instead of indices to FStrs, as their data will never be moved, only destroyed with the hash table itself.
Hypothetically, this can waste memory; imagine two allocations of 48 bytes, first comes into the first 64-byte chunk, and second allocates a new one, forever locking remaining 64−48=16 tail bytes of the first, as for the sake of speed and simplicity memory region will never consider them anymore (though it could). But in practice this problem does not exist: when compiling my application, max hash table size reaches three or four megabytes (not counting its string pool but counting everything else; string pool is a separate three or four megabytes, too). Meanwhile, its memory region wastes on these tails (not counting its last, active chunk) around 120 bytes. Of course, this is vastly less than potential amount of garbage from the fact that strings can be deleted from the hash table but are never deleted from the pool (I think I have seen around 30 Kb), or from the fact that reallocating a memory chunk requires a free space that is a (100+k)% multiplier of its size.
What is slightly less useless, while working on it I also found a minor bug and a suspicious place in the code.
Minor bug is that in scanner.pas, hash table of ignoredirectives is supposed to prevent repeated warnings about unknown compiler directives, but such directives are remembered by associating nil pointers with them. Even if this was later checked “correctly” with ignoredirectives.FindIndexOf(hs)>=0 instead of ignoredirectives.find(hs)<>nil, TFPHashList explicitly treats elements with associated nils as non-existing and will return −1 even from FindIndexOf. I have replaced nils with pointer(1).
Suspicious place is in ogbase.pas here. (F)ExeSectionList is not just a TFPHashList of pointers but TFPHashObjectList of objects, and is constructed with OwnsObjects=True. The thing is, assigning an item of TFPHashObjectList that OwnsObjects frees the old item. Right after this FExeSectionList is freed itself, so either this nilifying loop can be removed (if freeing old FExeSectionList contents is the desirable behaviour) or must be replaced with FExeSectionList.OwnsObjects:=false (it this is an error).