Better “Format” (if you don’t mix encodings... but probably even if you do). (!407) · Merge requests · FPC / FPC / FPC Source

Rika requested to merge runewalsh/source:format into main Apr 27, 2023

I acknowledge that touching this before the unicode branch is merged is undesirable, but hey, this thing already supports all character and string types, doesn’t it?

While #40247 (closed) formally wanted the impossible, I think he had a point. I attempted to redo Format to avoid most of the allocations by copying pieces directly to the destination (instead of concatenating temporary strings) and by using shortstring Str() versions for 1-byte character strings. On the following synthetic example:

Format('A: %s; B: %s; C: %d; D: %d; E: %x', ['a', 'bb', 123, 456, 789]);

my version calls 1 GetMem and 1 ReallocMem (my heuristic overestimates the length of the result, so the final SetLength ends up shrinking the memory chunk with ReallocMem...), while the existing version calls 9 GetMems, 3–6 ReallocMems, and 8 FreeMems. As a consequence, my version is more than 3× faster here (300 ns/call vs 1 µs/call).
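To make the idea concrete, here is a minimal sketch of the "estimate, allocate once, copy pieces in place, shrink" approach. It is not the code from this merge request: TinyFormat is a hypothetical helper that handles only %s, %d and %%, takes arguments strictly in order, and ignores widths, precision and index specifiers. Integers go through the shortstring version of Str(), so they need no heap allocation of their own.

```pascal
program FormatSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Hypothetical helper, NOT the code from this merge request.
  Supports only %s (char or string argument), %d (32-bit integer) and %%. }
function TinyFormat(const Fmt: string; const Args: array of const): string;
var
  Res: string;
  Num: shortstring;   { scratch buffer on the stack, no heap allocation }
  i, k, Start, Used, Estimate: SizeInt;

  { Copy Len bytes into the preallocated result and advance the write position. }
  procedure Append(Src: PChar; Len: SizeInt);
  begin
    if Len <= 0 then
      exit;
    Move(Src^, Res[Used + 1], Len);
    Inc(Used, Len);
  end;

begin
  { Pass 1: upper bound on the result length. Every specifier is two
    characters long here, so Length(Fmt) already covers the literal text. }
  Estimate := Length(Fmt);
  for k := 0 to High(Args) do
    case Args[k].VType of
      vtChar:       Inc(Estimate, 1);
      vtString:     Inc(Estimate, Length(Args[k].VString^));
      vtAnsiString: Inc(Estimate, Length(AnsiString(Args[k].VAnsiString)));
      vtInteger:    Inc(Estimate, 11); { '-2147483648' has 11 characters }
    else
      raise EConvertError.Create('TinyFormat: unsupported argument type');
    end;

  { Pass 2: one allocation for the result, then copy the pieces into it. }
  SetLength(Res, Estimate);
  Used := 0;
  k := 0;
  i := 1;
  while i <= Length(Fmt) do
  begin
    if Fmt[i] <> '%' then
    begin
      { Copy a whole run of literal text with one Move. }
      Start := i;
      while (i <= Length(Fmt)) and (Fmt[i] <> '%') do
        Inc(i);
      Append(PChar(Fmt) + (Start - 1), i - Start);
      continue;
    end;
    Inc(i);   { skip '%' }
    if i > Length(Fmt) then
      raise EConvertError.Create('TinyFormat: dangling %');
    case Fmt[i] of
      '%':
        Append(PChar(Fmt) + (i - 1), 1);
      's', 'd':
        begin
          if k > High(Args) then
            raise EConvertError.Create('TinyFormat: too few arguments');
          if (Fmt[i] = 's') and (Args[k].VType = vtAnsiString) then
            Append(PChar(Args[k].VAnsiString),
                   Length(AnsiString(Args[k].VAnsiString)))
          else
          begin
            if (Fmt[i] = 's') and (Args[k].VType = vtChar) then
              Num := Args[k].VChar
            else if (Fmt[i] = 's') and (Args[k].VType = vtString) then
              Num := Args[k].VString^
            else if (Fmt[i] = 'd') and (Args[k].VType = vtInteger) then
              Str(Args[k].VInteger, Num)   { shortstring Str(): no heap use }
            else
              raise EConvertError.Create('TinyFormat: argument type mismatch');
            Append(@Num[1], Length(Num));
          end;
          Inc(k);
        end;
    else
      raise EConvertError.Create('TinyFormat: unknown specifier');
    end;
    Inc(i);
  end;
  SetLength(Res, Used);   { shrink to the real length; this may ReallocMem }
  Result := Res;
end;

begin
  WriteLn(TinyFormat('A: %s; B: %s; C: %d; D: %d', ['a', 'bb', 123, 456]));
end.
```

Because the estimate is only an upper bound, the final SetLength is where the single shrinking ReallocMem mentioned above would come from.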

Even on the following:

Format('A: %s', ['hi']);

my version does considerably fewer allocations (1 GetMem + 0–1 ReallocMem vs 3 GetMem + 2 FreeMem) and is more than 2× faster (100 vs 250 ns/call).

Strings with 2 bytes per character miss the allocation shortcuts that rely on shortstrings for integers (they could get these shortcuts with more code, using the fact that such strings are usually ASCII), but still have greatly reduced allocation counts and similar speedups. For example,

UnicodeFormat('A: %s; B: %s; C: %d; D: %d; E: %x', ['a', 'bb', 123, 456, 789]);

takes 28 GetMems and 2.4 µs/call before the patch, and 6 GetMems and 700 ns/call after it.

However, to support codepage-aware strings in 1-byte Format, if an ansistring whose encoding differs from DefaultSystemCodePage is encountered, the entire Format is redirected to UnicodeFormat. This will usually be faster anyway, as the existing Format would instead concatenate strings with different encodings, potentially converting them to and from Unicode again and again (that’s how ansistr_concat_complex works). In the first example, replacing 'bb' with bbExotic, where bbExotic := 'bb'; SetCodePage(bbExotic, 866, false);, remains better with 9+1 vs 12+6 (re)allocations and 1 vs 1.3 µs/call, but counterexamples must exist, if only because I don’t do a prepass for such strings and instead fall back from the middle of the work, throwing away the part already done in ANSI.
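The condition that triggers the fallback can be sketched as a per-argument check. This is an illustration only: HasForeignCodePage is a hypothetical name, and treating CP_ACP and CP_NONE as "not foreign" is an assumption of this sketch rather than something the patch is known to do. StringCodePage, SetCodePage and DefaultSystemCodePage are the regular System unit facilities.

```pascal
program CodePageCheck;
{$mode objfpc}{$H+}

{ Hypothetical helper: True if S carries an explicit code page that differs
  from the default one. CP_ACP (the "use the default" placeholder) and
  CP_NONE (raw bytes) are assumed not to require a fallback. }
function HasForeignCodePage(const S: RawByteString): boolean;
var
  cp: TSystemCodePage;
begin
  if S = '' then
    exit(false);
  cp := StringCodePage(S);
  Result := (cp <> CP_ACP) and (cp <> CP_NONE) and (cp <> DefaultSystemCodePage);
end;

var
  bb, bbExotic: ansistring;
begin
  bb := 'bb';
  bbExotic := 'bb';
  { Relabel the bytes as code page 866 without converting them. }
  SetCodePage(bbExotic, 866, false);
  WriteLn('bb:       ', HasForeignCodePage(bb));
  WriteLn('bbExotic: ', HasForeignCodePage(bbExotic));
end.
```

On a system whose default code page is not 866, this should flag bbExotic and not bb. A Format built along these lines would run such a check as each string argument is reached, which matches the "fall back from the middle of the work" behaviour described above rather than a prepass.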

I also changed several things:

Unknown format specifiers throw errors instead of being silently removed from the string (previously, Format('Q=[%q]', ['hi']) returned 'Q=[]'). Delphi does the same.
Format('%*:s', [High(int32) + 1]) doesn’t crash, i.e. it throws a language exception instead of a CPU exception. Delphi does the same.
Padding with spaces is not limited to 255 spaces, but a sane limit (1000 spaces) is imposed to prevent accidentally consuming all RAM on bad input (Delphi doesn’t limit it at all). Previously, Space() padded with N spaces modulo 256, which might be surprising. Numbers are padded to 32 (64-bit) or 16 (32-bit) digits at most; greater paddings are treated as no padding (like Delphi does).
Error messages highlight the position with |. I find it useful if you made a typo and wonder where.
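As an illustration of the position marker, a helper along these lines would do. MarkPosition is a hypothetical name, and placing the | immediately before the offending character is an assumption; the exact message layout used by the patch is not shown here.

```pascal
program MarkSketch;
{$mode objfpc}{$H+}

{ Hypothetical helper: insert '|' before the character at 1-based index P. }
function MarkPosition(const S: string; P: SizeInt): string;
begin
  Result := Copy(S, 1, P - 1) + '|' + Copy(S, P, Length(S));
end;

begin
  { Index 5 is the unknown specifier letter 'q' in this format string. }
  WriteLn(MarkPosition('Q=[%q]', 5));
end.
```

This sketch prints Q=[%|q], pointing at the unknown specifier from the first item in the list.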
