💀 REWRITE HEAPTRC.PP 💀
- Use global state with locks instead of the heap.inc-like thread-local approach. While this may slow down multithreaded scenarios in particular (hopefully somewhat compensated by other changes), multithreaded performance is less important for heaptrc than, for example, the ability to catch a double free immediately instead of queuing it up. There were already bugs caused by heaptrc trying to be clever about multithreading; a modified example from that MR:
{$mode objfpc} {$modeswitch anonymousfunctions}
var
  th: TThreadID;
  p: pointer;
begin
  {$if declared(quicktrace)} quicktrace := false; {$endif} // doesn’t help
  p := GetMem(1000);
  th := BeginThread(
    function(param: pointer): PtrInt
    begin
      FreeMem(p);
      FreeMem(p); // MR throws an error here instantly.
      result := 0;
    end);
  WaitForThreadTerminate(th, 0);
  CloseThread(th);
  writeln('still alive');
end. // Trunk throws an error *here*.
Because of this, trunk doesn’t catch the double free in time. In fact, trunk heaptrc sometimes fails to catch even a simple double free properly (it is supposed to, but the check is not reached with GetMem(1000); GetMem(100) works fine; I didn’t investigate why):
var
  p: pointer;
begin
  p := GetMem(1000);
  FreeMem(p);
  // Trunk: segfault, unless KeepReleased is set
  // (which might be impractical to the point of impossibility).
  // MR: notes invalid p and throws RunError(204).
  // With KeepReleasedBytes := 1024 * 1024, it also notes the freeing site.
  FreeMem(p);
end.
My version is guaranteed to catch any invalid free immediately and “for free”, so quicktrace becomes meaningless (and is removed).
- Use a separate hash table that stores data associated with allocations. This hash table is less likely to get corrupted, and it reduces the effect of heaptrc on the allocation pattern, so you are less likely to have heisenbugs that disappear on enabling heaptrc. (Now you see how exactly invalid frees get caught immediately, guaranteed, and “for free”: this hash table lookup is essential for normal operation anyway...)
Example of such a heisenbug:
var
  a: array of byte;

function GrowA(by: SizeInt): SizeInt;
begin
  result := length(a);
  SetLength(a, result + by);
end;

begin
  SetLength(a, 500);
  // Programmer error; compiler caches the “a” pointer before it calls GrowA,
  // and has the right to do so.
  a[GrowA(50)] := 123;
  if a[length(a) - 50] = 123 then
    writeln('OK.') // reached with trunk heaptrc
  else
    writeln('Bug.'); // reached without heaptrc, or with MR’s heaptrc
end.
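The side-table idea from this item can be illustrated with a minimal sketch. It is not the MR's actual code: the names Track, Untrack, TEntry and BucketCount are hypothetical, the table is a fixed-size chained hash, and entries are allocated with New for brevity (a real heap tracer would presumably have to get its metadata memory without re-entering itself and would take a lock around these calls). The point it shows is that a free has to look the pointer up anyway, so an unknown or already-removed pointer is noticed at that very call.

```pascal
{$mode objfpc}
program SideTableSketch;

const
  BucketCount = 64; // hypothetical fixed size; a real table would grow

type
  PEntry = ^TEntry;
  PPEntry = ^PEntry;
  TEntry = record
    Addr: pointer;  // user pointer being tracked
    Size: SizeUint; // example of per-allocation data kept outside the block
    Next: PEntry;   // chaining within one bucket
  end;

var
  Buckets: array[0 .. BucketCount - 1] of PEntry; // globals start zeroed

function BucketOf(p: pointer): SizeUint;
begin
  // Drop the low bits, which are usually zero because of alignment.
  result := (PtrUint(p) shr 4) mod BucketCount;
end;

// Records a freshly allocated block in the side table.
procedure Track(p: pointer; size: SizeUint);
var
  e: PEntry;
begin
  New(e);
  e^.Addr := p;
  e^.Size := size;
  e^.Next := Buckets[BucketOf(p)];
  Buckets[BucketOf(p)] := e;
end;

// Removes p from the side table.
// Returns false if p is not a live tracked block (invalid or double free).
function Untrack(p: pointer): boolean;
var
  link: PPEntry;
  e: PEntry;
begin
  link := @Buckets[BucketOf(p)];
  while Assigned(link^) do
  begin
    e := link^;
    if e^.Addr = p then
    begin
      link^ := e^.Next;
      Dispose(e);
      exit(true);
    end;
    link := @e^.Next;
  end;
  result := false;
end;

var
  p: pointer;
begin
  p := GetMem(1000);
  Track(p, 1000);
  writeln('first free is valid: ', Untrack(p));
  writeln('second free is valid: ', Untrack(p)); // lookup fails: double free
  FreeMem(p);
end.
```

Because the bookkeeping lives outside the user block, nothing is prepended to the allocation itself, which is why the block layout seen by the program stays close to the layout without heaptrc.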
- Stack traces are stored compressed and deduplicated in their own hash table. Trunk heaptrc requires sizeof(pointer) per trace item and always allocates the maximum number of trace items, which is 16 by default, so with 8-byte pointers traces always consume 128 bytes per trace, that is, per allocation.
Any code that does several allocations in a loop benefits from sharing stacktraces:
var
  p: array of array of array of byte;
begin
  {$if declared(IncludeInternalAllocationsInStatistics)}
  IncludeInternalAllocationsInStatistics := true;
  {$endif}
  SetLength(p, 1000, 100, 10); // fpc_dynarray_setlength recursively loops over subarrays
  writeln(GetFPCHeapStatus.CurrHeapUsed / 1024 / 1024:0:2, ' Mb used');
  // No heaptrc: 3.85 Mb used. (32-bit: 3.45 Mb.)
  // Trunk heaptrc: 26.92 Mb used. (32-bit: 12.73 Mb.)
  // MR heaptrc: 9.11 Mb used. (32-bit: 5.29 Mb.)
end.
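To make the sharing concrete, here is a rough sketch of trace deduplication (interning). It only illustrates the general technique and says nothing about how the MR compresses traces: TTrace, Intern, HashTrace, TableSize and the fake frame addresses are all hypothetical, the table is a fixed-size open-addressing array that is assumed never to fill up, and the hash is an arbitrary FNV-style mix.

```pascal
{$mode objfpc} {$Q-} {$R-} // the hash below relies on wrapping arithmetic
program TraceInternSketch;

const
  MaxFrames = 16;  // trunk's default trace depth, as stated above
  TableSize = 256; // hypothetical capacity; the sketch assumes it never fills

type
  TTrace = record
    Count: SizeInt;
    Frames: array[0 .. MaxFrames - 1] of CodePointer;
  end;

var
  Traces: array[0 .. TableSize - 1] of TTrace;
  Used: array[0 .. TableSize - 1] of boolean;
  UniqueTraces: SizeInt = 0;

function HashTrace(const t: TTrace): SizeUint;
var
  i: SizeInt;
begin
  result := 2166136261;
  for i := 0 to t.Count - 1 do
    result := (result xor PtrUint(t.Frames[i])) * 16777619;
end;

function SameTrace(const a, b: TTrace): boolean;
var
  i: SizeInt;
begin
  result := a.Count = b.Count;
  i := 0;
  while result and (i < a.Count) do
  begin
    result := a.Frames[i] = b.Frames[i];
    inc(i);
  end;
end;

// Returns the slot that holds t, storing t first if it was not seen before.
// Each allocation would then keep only this small index, not 16 pointers.
function Intern(const t: TTrace): SizeInt;
begin
  result := HashTrace(t) mod TableSize;
  while Used[result] and not SameTrace(Traces[result], t) do
    result := (result + 1) mod TableSize; // linear probing
  if not Used[result] then
  begin
    Traces[result] := t;
    Used[result] := true;
    inc(UniqueTraces);
  end;
end;

var
  t: TTrace;
  i, firstId: SizeInt;
begin
  FillChar(t, sizeof(t), 0);
  t.Count := 3;
  // Made-up return addresses standing in for a real backtrace.
  t.Frames[0] := CodePointer(PtrUint($1000));
  t.Frames[1] := CodePointer(PtrUint($2000));
  t.Frames[2] := CodePointer(PtrUint($3000));
  firstId := Intern(t);
  // 999 more "allocations" from the same call site reuse the stored trace.
  for i := 1 to 999 do
    if Intern(t) <> firstId then
      writeln('unexpected: trace was stored twice');
  writeln('unique traces: ', UniqueTraces); // prints "unique traces: 1"
end.
```

For scale, and assuming each dynamic array in the example above is one heap block, SetLength(p, 1000, 100, 10) makes 1 + 1000 + 100000 = 101001 allocations; at 128 bytes per trace that is 12928128 bytes of traces alone, a sizeable part of the overhead measured with trunk heaptrc.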
- heaptrc now incrementally checks all blocks for corruption, not just the ones being released. Example:
var
  p: pByte;
begin
  p := GetMem(100);
  p[100] := 0; // corrupt p
  writeln('reached a');
  FreeMem(GetMem(100));
  // MR notices p corruption over time;
  // in this simple example, it is reliably detected at this point.
  // Alternatively, CheckHeap([step]) can be called at any time.
  writeln('reached b');
  // trunk detects p corruption only when freeing p, which might not happen soon...
  FreeMem(p);
  writeln('reached c');
end.
A new global variable, AutoCheckHeapStep, gives some control over this. Each GetMem / FreeMem / ReallocMem / MemSize call checks up to AutoCheckHeapStep allocations. The default is 3; setting AutoCheckHeapStep to High(SizeUint) scans the entire heap on every call; you can also call CheckHeap manually.
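As a usage sketch, the fragment below turns on the full scan for a debugging session, reusing the {$if declared(...)} guard from the earlier examples so that it still compiles against a heaptrc without this variable. It assumes the program is built with heaptrc enabled (for instance with -gh) and that AutoCheckHeapStep is assignable as described above; the exact moment of the report is not verified here.

```pascal
var
  p: pByte;
begin
  {$if declared(AutoCheckHeapStep)}
  // Scan every tracked block on each heap call: slow, but a corruption
  // should be reported at the first heap call that follows it.
  AutoCheckHeapStep := High(SizeUint);
  {$endif}
  p := GetMem(100);
  p[100] := 0;          // deliberate one-byte overrun, as in the example above
  FreeMem(GetMem(100)); // any heap call; the MR is expected to complain here
  FreeMem(p);
end.
```

Leaving the default of 3 keeps the per-call cost small and spreads the scan over many calls, at the price of a possibly later report.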
- Replace KeepReleased with KeepReleasedBytes: SizeUint, the maximum size to be kept, instead of the boolean that keeps everything and often becomes impractical. It still defaults to zero because it affects the allocation pattern heavily.
I planned to demonstrate this on FPC itself, but for some reason it crashes with both trunk and MR heaptrc :D.
As an alternative, FormatJSON.pas modified from !1020 (comment 2515823273) seems a good illustration.
No heaptrc: runtime 4.2 s, peak memory 942.5 Mb.
Trunk heaptrc: runtime 79.8 s, peak memory 4235.0 Mb.
MR heaptrc: runtime 87.4 s, peak memory 1479.7 Mb.