
[Cross-platform] Constant string index node simplification

Summary

This merge request simplifies accessing a specific character of a string constant (e.g. SConst[2]) by converting the vecn node into a direct character constant (internally an ordconstn).
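As a rough illustration of the source pattern this targets, the following minimal sketch indexes an untyped string constant. The program name, the constant's value and the variable C are hypothetical; only the SConst[2] form comes from the description above. With the patch applied, the indexed read would be expected to fold to the character constant 'e' at compile time; the observable behaviour of the program does not change either way.

```pascal
program vecsimplify_sketch;

{$mode objfpc}{$H+}

const
  { Hypothetical constant, chosen only for illustration }
  SConst = 'Hello';

var
  C: Char;
begin
  { Indexing a string constant with a constant index: the candidate
    for simplification from a vecn node into an ordconstn }
  C := SConst[2];

  { Self-check: the result must be the same with or without the patch }
  if C <> 'e' then
    Halt(1);

  WriteLn(C); { prints "e" }
end.
```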

System

  • Processor architecture: All (cross-platform)

What is the current bug behavior?

N/A

What is the behavior after applying this patch?

Code that reads a single character directly from a string constant now compiles to simpler code, usually because a pointer dereference is removed from the generated machine code.

Additional notes

No examples that this patch simplifies appear in the existing tests, compiler, RTL or packages, so six new tests - test/cg/tvecsimplify1.pp, test/cg/tvecsimplify1a.pp, test/cg/tvecsimplify2.pp, test/cg/tvecsimplify2a.pp, test/cg/tvecsimplify3.pp and test/cg/tvecsimplify4.pp - were added to verify correct code generation.

During initial development, one case of indexing a string constant, in packages\gnutls\src\gnutlssockets.pp, raised internal error 2006111510 when simplified. The code was not reading the character but taking its effective address (an @ appeared before the expression). The fix is to simplify the vecn node only if the nf_address_taken flag is clear. If the flag is set, the branch is left alone, because there is no telling what will be done with the pointer afterwards: the program will likely expect it to point somewhere within the original string constant and manipulate it accordingly.
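The address-taken case can be sketched as follows. This is not the code from gnutlssockets.pp; the program name, constant and variable are hypothetical, and it assumes that taking the address of an indexed untyped string constant is accepted in this form. Because the result is a pointer into the constant's storage, folding the vecn node into a bare character would leave nothing to take the address of, which is why the node must be left as-is.

```pascal
program vecaddress_sketch;

{$mode objfpc}{$H+}

const
  { Hypothetical constant, chosen only for illustration }
  SConst = 'Hello';

var
  P: PChar;
begin
  { The @ operator means the address of the second character is taken,
    so this vecn node must not be replaced by a character constant }
  P := @SConst[2];

  { The pointer refers into the original constant, so dereferencing it
    yields the second character }
  if P^ <> 'e' then
    Halt(1);
end.
```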

widestring and unicodestring constants are not yet simplified because the compiler does not replace the symbols with literals in the node tree, unlike with shortstring and ansistring constants. Nevertheless, test/cg/tvecsimplify3.pp and test/cg/tvecsimplify4.pp exist to verify code correctness when support is implemented in the future.

Edited by J. Gareth "Kit" Moreton
