LLVM - Pointer values of procedural variable constants cast from integers are truncated
Summary
When constructing procedural variable constants from integer values (with the help of {$modeswitch pointertoprocvar}),
the LLVM backend truncates the stored pointer value to some common word size (i.e. 8, 16, 32 and 64 bit words)
depending on the number of leading (i.e. high) set bits.
Quite a mouthful, isn't it? You may ask yourself: Why is this relevant? Why would you need to cast integers into function pointers?
Well, the reason I stumbled upon this bug in the first place is the following definition from SQLite3:
typedef void (*sqlite3_destructor_type)(void*);
#define SQLITE_TRANSIENT ((sqlite3_destructor_type) -1)
In my own SQLite3 header translation, I expressed this in Pascal like so:
{$modeswitch pointertoprocvar}
type
psqlite3_destructor = procedure (block: pcvoid); SQLITE_API;
const
SQLITE_TRANSIENT: psqlite3_destructor = Pointer(-1);
(I want SQLITE_TRANSIENT to be usable without requiring that both a cast to psqlite3_destructor
and {$modeswitch pointertoprocvar} be present at the site of use, meaning that the constant has to be typed.)
Running an LLVM-compiled application that makes use of SQLite3 (and hence this constant) crashes with
an access violation at address 0x00000000000000FF because - as I had to find out - the value of
SQLITE_TRANSIENT stored in the binary isn't 0xFFFFFFFFFFFFFFFF, but instead 0x00000000000000FF,
leading to SQLite3 attempting to use it as a genuine function pointer.
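To illustrate why the wrong constant ends up being called, here is a minimal sketch of a consumer-side sentinel check. It is not SQLite3's actual code: TDestructor, IsSentinel and both variables are hypothetical names, and the check simply assumes the sentinel is recognised by comparing the address against the all-ones value.

```pascal
program sentinel_check_sketch;
{$mode objfpc}
{$C+}

type
  // Hypothetical stand-in for a destructor callback type.
  TDestructor = procedure(Block: Pointer); cdecl;

// Hypothetical check: only the all-ones address counts as the sentinel.
// Any other value would be treated as a real, callable function pointer.
function IsSentinel(Callback: TDestructor): Boolean;
begin
  Result := PtrUInt(Pointer(Callback)) = High(PtrUInt);
end;

var
  IntendedValue, StoredValue: TDestructor;
begin
  // The value the constant is supposed to hold (all bits set).
  IntendedValue := TDestructor(Pointer(High(PtrUInt)));
  // The value that was actually found in the binary.
  StoredValue := TDestructor(Pointer(PtrUInt($FF)));

  Assert(IsSentinel(IntendedValue));
  // The truncated value fails the check, so a consumer would try to call it.
  Assert(not IsSentinel(StoredValue));
end.
```

The addresses are only compared here, never called, so the sketch itself does not crash.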
Replacing Pointer(-1) with Pointer(High(PtrUInt)), Pointer(not PtrUInt(0)) or similar tricks has no effect.
After some testing, I found out that procvar constants get truncated according to the following strange pattern, where:
- '.' is a nibble with any bit pattern (i.e. 0bxxxx)
- '+' is a nibble that has its high bit set (i.e. 0b1xxx)
- '-' is a nibble that has its high bit unset (i.e. 0b0xxx)
0xFFFFFFFFFFFFFF+. -> 0x00000000000000+.
0xFFFFFFFFFFFFFF-. -> 0x000000000000FF-.
0xFFFFFFFFFFFF+... -> 0x000000000000+...
0xFFFFFFFFFFFF-... -> 0x00000000FFFF-...
0xFFFFFFFF+....... -> 0x00000000+.......
0xFFFFFFFF-....... -> 0xFFFFFFFF-.......
Some example values:
0xFFFFFFFFFFFFFFFF -> 0x00000000000000FF
0xFFFFFFFFFFFFFF80 -> 0x0000000000000080
0xFFFFFFFFFFFFFF7F -> 0x000000000000FF7F
0xFFFFFFFFFFFFAAAA -> 0x000000000000AAAA
0xFFFFFFFFFFFF2AAA -> 0x00000000FFFF2AAA
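The pattern can also be written down as a small model. This is only a sketch of the observed behaviour on a 64-bit target, not a statement about what the LLVM backend actually does internally: ObservedValue is a hypothetical helper, and the reading that the value is shrunk to the smallest signed width that can hold it and then zero-extended is an assumption that happens to match the values listed above.

```pascal
program procvar_truncation_model;
{$mode objfpc}
{$C+}

// Hypothetical model of the observed pattern (assumption, see text):
// treat the value as signed, pick the smallest of 8/16/32/64 bits that
// can represent it, then keep only those low bits (zero-extended).
function ObservedValue(Value: QWord): QWord;
var
  AsSigned: Int64;
begin
  AsSigned := Int64(Value);
  if (AsSigned >= Low(ShortInt)) and (AsSigned <= High(ShortInt)) then
    Result := Value and QWord($FF)
  else if (AsSigned >= Low(SmallInt)) and (AsSigned <= High(SmallInt)) then
    Result := Value and QWord($FFFF)
  else if (AsSigned >= Low(LongInt)) and (AsSigned <= High(LongInt)) then
    Result := Value and QWord($FFFFFFFF)
  else
    Result := Value;
end;

begin
  // The example values from this report, checked against the model.
  Assert(ObservedValue($FFFFFFFFFFFFFFFF) = QWord($00000000000000FF));
  Assert(ObservedValue($FFFFFFFFFFFFFF80) = QWord($0000000000000080));
  Assert(ObservedValue($FFFFFFFFFFFFFF7F) = QWord($000000000000FF7F));
  Assert(ObservedValue($FFFFFFFFFFFFAAAA) = QWord($000000000000AAAA));
  Assert(ObservedValue($FFFFFFFFFFFF2AAA) = QWord($00000000FFFF2AAA));
  WriteLn('Model matches the reported example values');
end.
```

If the model is right, any constant whose upper bits are all set gets mangled, while values that need the full 64 bits as a signed number pass through unchanged.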
System Information
- Operating system: Linux (Debian 12.10.0)
- Processor architecture: x86_64
- Compiler version: Trunk @ d3975bd65e6b4865abeaa0ce3448bb2df7b67ba8 and LLVM 16
Steps to reproduce
Compile & run the following program (using an LLVM-enabled version of FPC):
Example Project
program llvm_procvar_const;
{$C+}
{$modeswitch pointertoprocvar}
const
  INTEGER_VALUE = $FFFFFFFFFFFFFFFF;
  POINTER_VALUE: TProcedure = Pointer(INTEGER_VALUE);
var
  Ptr, Int: String;
begin
  Ptr := HexStr(POINTER_VALUE);
  Int := HexStr(PtrUInt(INTEGER_VALUE), SizeOf(Pointer) * 2);
  WriteLn('Pointer(', Ptr, '), PtrUInt(', Int, ')');
  Assert(Ptr = Int);
end.