Memory manager can be tricked into N SysOSAlloc + N SysOSFree calls with 3~4N plausible operations.

While investigating why !499 (merged) gives a 10~50× speedup instead of the expected 2×, I found that the main loop in this program:

{$mode objfpc}
const
	Reps = 10000;
	Size1 = 300;
	Size2 = 800;

var
	i: int32;
	p1, p2: pointer;

begin
	for i := 1 to Reps do
	begin
		p1 := GetMem(Size1);
		p2 := GetMem(Size2);
		FreeMem(p1);
		FreeMem(p2);
	end;
end.

makes Reps SysOSAlloc calls and ≈Reps SysOSFree calls. My arguments for why this shouldn’t be the case are:

  1. SysOSAlloc/SysOSFree (and probably other heap.inc machinery related to OS chunks?) are nontrivially slow by themselves (that’s why you have a memory manager in the first place!), but they are especially slow under the debugger: for me, SysOSAlloc+SysOSFree take 38+6 µs there (up from 100+100 ns without it; Linux might behave differently), so the entire loop runs in about half a second rather than instantly. I think I’ve noticed this slowdown countless times (and you may have as well) but didn’t attribute it to HeapAlloc/HeapFree. Of course, my numbers from !499 (merged) were measured without the debugger (with it they were worse by another 5× or so).

  2. Isn’t this supposed to be kept at bay by MaxKeptOSChunks, which is a whole 4 by default?
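
One rough way to observe the pattern without a debugger is to sample the heap size reported by GetFPCHeapStatus after every GetMem/FreeMem and count how often it grows or shrinks. This is only a sketch: it assumes CurrHeapSize changes exactly when an OS chunk is obtained or returned, which makes it a proxy rather than a true count of SysOSAlloc/SysOSFree calls. The Sample helper and the grows/shrinks counters are names made up for this illustration.

```pascal
{$mode objfpc}
const
	Reps = 10000;
	Size1 = 300;
	Size2 = 800;

var
	i: int32;
	p1, p2: pointer;
	prevSize, curSize: PtrUInt;
	grows, shrinks: int32;

// Hypothetical helper: compares the current heap size with the previous
// sample and records whether the heap grew or shrank in between.
procedure Sample;
begin
	curSize := GetFPCHeapStatus.CurrHeapSize;
	if curSize > prevSize then
		inc(grows)
	else if curSize < prevSize then
		inc(shrinks);
	prevSize := curSize;
end;

begin
	grows := 0;
	shrinks := 0;
	prevSize := GetFPCHeapStatus.CurrHeapSize;
	for i := 1 to Reps do
	begin
		// Same allocation pattern as the loop in question, sampled after each step.
		p1 := GetMem(Size1); Sample;
		p2 := GetMem(Size2); Sample;
		FreeMem(p1); Sample;
		FreeMem(p2); Sample;
	end;
	// Assumption: each growth/shrink corresponds to one OS chunk being taken or released.
	writeln('heap grew ', grows, ' times, shrank ', shrinks, ' times over ', Reps, ' iterations');
end.
```

If both counters come out close to Reps, that would be consistent with the behaviour described above; if they stay near zero, the proxy assumption may not hold for the heap implementation in use.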

Edited by Rika