New OPDF debug format - a format designed specifically for the Object Pascal language
Hi everyone,
I'd like to propose adding a new, optional debug information format to FPC called OPDF (Object Pascal Debug Format). This is not a replacement for DWARF — it is an additional debug format that can be selected with the new -gO compiler flag, similar to how -gw selects DWARF.
A standalone reference debugger (PDR), the OPDF binary format library, and all research/design documentation are available at:
https://github.com/graemeg/opdebugger
NOTE: Some of the early analysis and design documentation predates the current implementation and may be slightly dated. I have kept it in to show the rationale and design evolution behind the format. This was my third attempt at getting this implementation right.
Why a new format?
DWARF is a mature and capable standard, but it was designed around C/C++ concepts. Object Pascal has language features that DWARF does not natively represent:
- Properties (field-backed and method-backed with getter/setter)
- String types (ShortString, plus the reference-counted AnsiString and UnicodeString)
- Sets (bitfield over ordinal/enum base types)
- Class inheritance with VMT, published RTTI, and interfaces (COM/CORBA)
- Dynamic arrays with SizeInt-based length metadata
Today, FPC encodes these into DWARF using workarounds and conventions that external debuggers (GDB, LLDB) do not understand. The result is that debugging Object Pascal with GDB often shows strings as raw byte pointers, properties as inaccessible, and sets as opaque integers.
OPDF encodes these concepts directly. A property record carries its getter method name and read/write kind. An AnsiString record carries its type ID so the debugger knows to dereference the pointer and read the length header. A set record carries its base enum type ID and lower bound so the debugger can decode the bitfield into member names.
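As a rough illustration of the set case, here is a minimal Python sketch of how a debugger could turn a set's raw bytes into member names once it knows the base enum's member names and lower bound. The function name, its parameters, and the TDay enum are hypothetical and are not taken from the OPDF specification. It assumes that bit i of the little-endian bitfield stands for the ordinal lower_bound + i; check the specification for the actual layout.

```python
def decode_set(raw: bytes, member_names: dict, lower_bound: int = 0) -> list:
    """Return the names of the members present in a set stored as a bitfield.

    Assumption: bit i (little-endian) represents ordinal lower_bound + i.
    """
    bits = int.from_bytes(raw, "little")
    members = []
    for i in range(len(raw) * 8):
        if (bits >> i) & 1:
            ordinal = lower_bound + i
            # Fall back to the bare ordinal when the enum has no name for it.
            members.append(member_names.get(ordinal, f"#{ordinal}"))
    return members


# Hypothetical enum: type TDay = (Monday, Tuesday, ..., Sunday)
DAY_NAMES = {0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thursday",
             4: "Friday", 5: "Saturday", 6: "Sunday"}

# 0b0100101 has bits 0, 2 and 5 set.
print(decode_set(bytes([0b0100101]), DAY_NAMES))
# ['Monday', 'Wednesday', 'Saturday']
```

Without the base enum type ID and lower bound, the same byte could only be shown as the opaque integer 37.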
The full rationale and analysis, including a comparison with DWARF and a look at Kylix and Delphi's approach, is documented at:
https://github.com/graemeg/opdebugger/blob/master/docs/analysis.adoc
What the compiler changes include
- New debug writer: compiler/dbgopdf.pas (TOPDFDebugWriter)
- Type ID allocation: compiler/dbgopdf_typemap.pas (TTypeMapper)
- New -gO flag in compiler/options.pas
- Registration in compiler/systems.inc (dbg_opdf enum)
- Target registration in x86_64/cputarg.pas and i386/cputarg.pas
The debug writer emits an .opdf ELF section with all code addresses
resolved at link time by the assembler/linker. Each compilation unit
gets its own OPDF header, and the linker concatenates them. A
cross-module type deduplication system (via global TTypeMapper +
G_EmittedTypeIDs) ensures that shared types are emitted once, using
mangled type names for cross-unit dedup (following the same pattern as
DWARF in dbgdwarf.pas). A unit directory record at the end of the main
module lists all contributing units.
This has been tested with a 121-unit project, producing an 8.0 MB .opdf section with a 1.1x dedup ratio. The debugger also handles TypeID collisions between separately compiled modules, which arise when build systems like PasBuild invoke ppcx64 once per module: colliding TypeIDs are remapped at load time.
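The load-time remapping can be pictured with the following Python sketch. It is not PDR's actual algorithm. The per-module tables of (type_id, name) pairs, the function name, and the rule that "the same ID with a different type name is a collision" are assumptions made for illustration, and the type names are placeholders rather than real FPC mangled names.

```python
def merge_type_ids(modules):
    """Merge per-module type tables, remapping IDs that collide.

    modules: list of lists of (type_id, type_name) pairs, one list per
    separately compiled module (hypothetical layout).
    Returns (global_table, remaps), where remaps[i] maps module i's
    original IDs to the IDs in use after loading.
    """
    global_table = {}  # final type_id -> type_name
    by_name = {}       # type_name -> final type_id
    # Fresh IDs start above every ID used by any module, so they cannot clash.
    next_free = 1 + max((tid for m in modules for tid, _ in m), default=0)
    remaps = []
    for module in modules:
        remap = {}
        for tid, name in module:
            if name in by_name:
                # The same type was already loaded: reuse its final ID.
                remap[tid] = by_name[name]
            elif tid in global_table:
                # The ID is taken by a different type: assign a fresh one.
                remap[tid] = next_free
                global_table[next_free] = name
                by_name[name] = next_free
                next_free += 1
            else:
                remap[tid] = tid
                global_table[tid] = name
                by_name[name] = tid
        remaps.append(remap)
    return global_table, remaps


# Two modules compiled separately; both happened to use ID 2 for their own type.
mod_a = [(1, "LONGINT"), (2, "UNITA.TFOO")]
mod_b = [(1, "LONGINT"), (2, "UNITB.TBAR")]
table, remaps = merge_type_ids([mod_a, mod_b])
print(table)   # {1: 'LONGINT', 2: 'UNITA.TFOO', 3: 'UNITB.TBAR'}
print(remaps)  # [{1: 1, 2: 2}, {1: 1, 2: 3}]
```

Every reference to a type ID inside the second module's records would then be rewritten through its remap table before the records are used.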
The OPDF binary format specification (v0.3.0, 20 record types) is at:
https://github.com/graemeg/opdebugger/blob/master/docs/opdf-specification.adoc
IMPORTANT: dbgopdf.pas does NOT import any units from the opdebugger
repo. The REC_* constants are duplicated locally with "{ must match
opdf_types.pas }" comments. This keeps the compiler fully
self-contained.
What the reference debugger can do today
The PDR debugger is a standalone CLI tool that reads OPDF data and controls the debuggee via Linux ptrace. It currently supports:
- Breakpoints by file:line, hex address, or variable name
- Hit-count conditional breakpoints (break ... if count=N)
- Source-level stepping (step over and step into)
- Call stack display with function names and source locations
- Variable evaluation for all Object Pascal types: primitives, floats, enums, sets, ShortString, AnsiString, UnicodeString, pointers, static arrays, dynamic arrays, records, classes (with field display), interfaces
- Compile-time constant evaluation
- Property evaluation: field-backed (automatic) and method-backed (via call injection on explicit request)
- Array slice display: print arr[2..5]
- In-process variable assignment: set x = 42, set day = Monday
- Local variable listing with scope awareness (including nested procedures)
- Structured inspect command for records, classes, and interfaces
- Auto-print display list (display/undisplay)
- Hardware watchpoints via x86_64 debug registers (DR0-DR3)
- Break on exception raise with class name and message display
- Method call injection for getter property evaluation (x86_64 SysV ABI, register save/restore, INT3 sentinel, red zone handling, managed return types)
27 automated integration tests cover all of the above.
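As background for the hardware watchpoint item in the list above, the Python sketch below computes the DR7 control value that enables a watchpoint in one of the four slots. The bit layout is the documented x86 one: the local-enable bit is at 2*slot, the R/W field at 16 + 4*slot, and the LEN field at 18 + 4*slot. The function name and its parameters are hypothetical and say nothing about how PDR structures this internally.

```python
# Encodings of the 2-bit R/W and LEN fields in DR7.
RW_BITS = {"execute": 0b00, "write": 0b01, "readwrite": 0b11}
LEN_BITS = {1: 0b00, 2: 0b01, 8: 0b10, 4: 0b11}


def dr7_enable(dr7: int, slot: int, kind: str, size: int) -> int:
    """Return dr7 with a watchpoint enabled in the given slot (0..3)."""
    if not 0 <= slot <= 3:
        raise ValueError("x86 has four debug address registers: DR0-DR3")
    shift = 16 + 4 * slot
    dr7 &= ~(0b1111 << shift)             # clear the slot's old R/W and LEN bits
    dr7 |= RW_BITS[kind] << shift         # access type that triggers the trap
    dr7 |= LEN_BITS[size] << (shift + 2)  # number of bytes watched
    dr7 |= 1 << (2 * slot)                # local-enable bit for this slot
    return dr7


# Watch 4 bytes for writes using DR0.
print(hex(dr7_enable(0, slot=0, kind="write", size=4)))
# 0xd0001
```

The watched address itself goes into the matching DR0-DR3 register and has to be aligned to the watched size. On Linux a tracer would typically write these registers with PTRACE_POKEUSER into the u_debugreg area of struct user.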
The debugger uses a hexagonal architecture (ports and adapters). The core engine depends only on interfaces (IProcessController, IDebugInfoReader, IArchAdapter) defined in pdr_ports.pas. This means integrating OPDF into an IDE like Lazarus or MSEide requires only implementing the adapter interfaces — the type evaluation, property resolution, breakpoint logic, call injection, and all display formatting are reusable as-is.
Architecture documentation:
https://github.com/graemeg/opdebugger/blob/master/docs/architecture.adoc
Design decisions log:
https://github.com/graemeg/opdebugger/blob/master/docs/design-decisions.adoc
Relationship with DWARF
To be clear: this proposal does not suggest removing DWARF support. DWARF remains valuable for interoperability with standard tools (GDB, LLDB, Valgrind, perf), for platforms where OPDF is not yet supported, and for features that OPDF does not yet cover (e.g. register-allocated variables in optimised code, .eh_frame-based stack unwinding).
OPDF and DWARF can coexist in the same binary — they use separate ELF sections. A user who needs GDB compatibility continues using -gw. A user who wants first-class Object Pascal debugging uses -gO. Over time, OPDF coverage will expand, but there is no pressure to deprecate anything.
Current limitations
- ELF64 only (Linux x86_64). Windows PE/COFF and macOS Mach-O are planned.
- Stack unwinding relies on RBP frame pointer chains, which breaks with optimised RTL compiled without frame pointers. OPDF unwind info records or .eh_frame fallback are planned.
- Float return values from injected method calls are not yet supported; reading XMM0 requires PTRACE_GETFPREGS.
- Variant records, the Variant type, and generics are not yet covered.
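To make the frame-pointer limitation above concrete, here is a minimal Python sketch of unwinding by following RBP links. It relies on the conventional x86_64 frame layout, where [rbp] holds the caller's saved RBP and [rbp+8] holds the return address. The read_memory callback, the fake stack, and all addresses are invented for illustration; this is not PDR's code.

```python
import struct


def walk_frames(read_memory, rip, rbp, max_frames=64):
    """Return the list of code addresses found by following the RBP chain.

    read_memory(addr, n) must return n bytes of debuggee memory or raise
    ValueError for an unmapped address.
    """
    frames = [rip]
    while rbp != 0 and len(frames) < max_frames:
        try:
            saved_rbp, ret_addr = struct.unpack("<QQ", read_memory(rbp, 16))
        except ValueError:
            break  # RBP did not point at readable stack memory
        if ret_addr == 0:
            break
        frames.append(ret_addr)
        if saved_rbp <= rbp:
            break  # the stack grows down, so callers live at higher addresses
        rbp = saved_rbp
    return frames


def make_reader(base, blob):
    """Build a read_memory callback over a fake block of stack memory."""
    def read_memory(addr, n):
        off = addr - base
        if off < 0 or off + n > len(blob):
            raise ValueError("unmapped address")
        return bytes(blob[off:off + n])
    return read_memory


# Fake stack at 0x7000 holding three frames linked through saved RBP values.
stack = bytearray(0x50)
struct.pack_into("<QQ", stack, 0x00, 0x7020, 0x401234)
struct.pack_into("<QQ", stack, 0x20, 0x7040, 0x401500)
struct.pack_into("<QQ", stack, 0x40, 0, 0x401900)
reader = make_reader(0x7000, stack)

print([hex(a) for a in walk_frames(reader, rip=0x401111, rbp=0x7000)])
# ['0x401111', '0x401234', '0x401500', '0x401900']

# If the current function was compiled without a frame pointer, RBP holds an
# unrelated value and the walk stops after the first entry.
print([hex(a) for a in walk_frames(reader, rip=0x401111, rbp=0xDEADBEEF)])
# ['0x401111']
```

The second call shows why dedicated unwind records or an .eh_frame fallback are needed for code built without frame pointers.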
The full progress tracker and roadmap:
https://github.com/graemeg/opdebugger/blob/master/docs/progress.adoc
After many years of building my career on Free Pascal and Lazarus, I wanted to give something substantial back to the community beyond individual bug fixes. OPDF and PDR have become a passion project that I intend to continue developing, and I hope the FPC community finds them useful.
I welcome any feedback on the approach, the compiler integration, or the format design. Happy to answer questions.
Regards, Graeme