Commit 47912857 authored by Jamie A. Jennings

Notes for future on "linking" RPL code with libraries

parent 55c01759
Friday, November 9, 2018
While adding the new instruction XCall, and specifically while adding code
generation for the TXCall tree type, we observe that the code generator that we
inherited from `lpcode.c` employs several optimizations. Some of these are lost
when the expression (tree) being compiled has type TXCall, because a TXCall node
refers opaquely to a pattern whose source code we do not, in general, have: the
pattern being called may be stored in a pre-compiled library.
After XCall is implemented and working, there are two techniques for re-enabling
these optimizations, and we should consider doing both of them:
(1) When we do in fact have the source code (a tree) for the pattern being
called, we can do all the optimizations, provided that the code for the
pattern being called cannot change later. This assumption is valid when we are
calling a pattern whose source will be compiled "at the same time" as the ones
that call it.
  (a) If A and B both call C, and we produce a single code vector that contains
  the code for A, B, and C, then we satisfy our assumption.
  (b) In an interactive scenario like the REPL, we must recompile A and B when
  C changes, because a change to C can invalidate the optimizations made
  around the call sites to C in A and B.
(2) When we do NOT have the source code for the called pattern, C, the compiled
code (instruction vector) for C must have come from a pre-compiled library. We
could enable optimizations while compiling A and B (which call C) if we store
some key properties of C along with its instruction vector when C is compiled.
This approach will require us to invalidate the compiled code for A and B when C
changes, so we will have to detect changes to C at "link time". If we only link
statically, then this scenario is moot. If, however, we allow dynamic linking
(at run-time) with pre-compiled libraries, then we must detect updates to C
compared to the versions required by A and B.
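The link-time check described in (2) could be sketched as follows. This is only an illustration under stated assumptions: the names `LibEntry`, `CallDep`, `props_version`, and `dep_is_current` are hypothetical and do not come from the existing code base. The idea is that each caller records which version of C's properties it was optimized against, and the linker compares that to what the library currently provides.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record stored next to a pattern's instruction vector
   in a pre-compiled library. */
typedef struct {
    const char *name;        /* pattern name, e.g. "C" */
    uint32_t props_version;  /* assumed to change whenever C's key properties change */
} LibEntry;

/* Hypothetical record of what a caller (A or B) assumed about its callee
   at the time the caller was compiled. */
typedef struct {
    const char *callee;         /* name of the pattern called via XCall */
    uint32_t required_version;  /* props_version of the callee seen at compile time */
} CallDep;

/* Return true when the optimizations made around this call site are still
   valid, i.e. the library holds the callee at the version the caller expects.
   A missing callee is treated as a link failure. */
static bool dep_is_current(const CallDep *dep, const LibEntry *lib, size_t nlib) {
    for (size_t i = 0; i < nlib; i++) {
        if (strcmp(lib[i].name, dep->callee) == 0)
            return lib[i].props_version == dep->required_version;
    }
    return false;
}
```

When `dep_is_current` returns false for any call site in A or B, the compiled code for that caller would be invalidated and recompiled (or the link rejected). With static linking only, this check would never be needed.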
N.B. None of these issues are novel, of course. There are reasonably good
known solutions for (1) in REPL-based languages like Lisp and Scheme, for
example. And (2) is a perennial problem for load-time linking of dynamic shared
objects.
If we decide to support dynamic linking, and then we further wish to enable
optimizations during code generation for XCall, then we should consider the
following properties of a pattern, C, that we might call:
  hascaptures(C)  Does C have any captures?
  nullable(C)     Can C succeed without consuming any input?
  nofail(C)       Does C never fail for any input, even the empty string?
  fixedlen(C)     The length of C in bytes when C is fixed length, else -1.
  firstset(C)     First set for C (see below).
  headfail(C)     True when C can fail depending only on the next byte of input.
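One way to store these properties alongside a compiled pattern is a small fixed-size record. The sketch below is an assumption, not an existing data structure: `PatternProps`, `props_consistent`, and `safe_to_repeat` are hypothetical names, and the 32-byte bitmap is just one possible encoding of a set of bytes. The consistency check uses only implications that follow from the definitions listed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a compiled pattern C, saved with its
   instruction vector when C is compiled. */
typedef struct {
    bool hascaptures;      /* does C have any captures? */
    bool nullable;         /* can C succeed without consuming any input? */
    bool nofail;           /* does C never fail, even on the empty string? */
    int32_t fixedlen;      /* length of C in bytes if fixed, else -1 */
    uint8_t firstset[32];  /* 256 bits: bit b is set when byte b is in the first set */
    bool headfail;         /* can C fail depending on the next byte of input? */
} PatternProps;

/* Reject records that contradict the definitions:
   - a pattern that never fails succeeds on empty input, so it must be nullable;
   - a pattern that always consumes n > 0 bytes cannot be nullable;
   - fixedlen is either -1 or a non-negative length. */
static bool props_consistent(const PatternProps *p) {
    if (p->nofail && !p->nullable) return false;
    if (p->fixedlen > 0 && p->nullable) return false;
    if (p->fixedlen < -1) return false;
    return true;
}

/* Example of a decision a caller's code generator could make without C's
   source: repeating a nullable pattern could loop forever, so a repetition
   of C is only safe to compile when C is not nullable. */
static bool safe_to_repeat(const PatternProps *p) {
    return !p->nullable;
}
```

A loader could run `props_consistent` on every record it reads from a library and refuse to optimize around call sites whose callee record fails the check.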
The "first set" of a pattern is a set of chars (bytes) such that the next byte
of the input must be one of them, else the pattern fails.
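To illustrate how a stored first set could be used at a call site, here is a minimal sketch in C. The type `ByteSet` and the functions `byteset_clear`, `byteset_add`, `byteset_has`, and `call_must_fail` are hypothetical names chosen for this example. The handling of end of input is a conservative assumption: with no next byte to test, the sketch does not predict failure and leaves the decision to the call itself.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* A set of byte values, one bit per possible byte (256 bits). */
typedef struct { uint8_t bits[32]; } ByteSet;

static void byteset_clear(ByteSet *s) {
    memset(s->bits, 0, sizeof s->bits);
}

static void byteset_add(ByteSet *s, uint8_t b) {
    s->bits[b >> 3] |= (uint8_t)(1u << (b & 7));
}

static bool byteset_has(const ByteSet *s, uint8_t b) {
    return ((s->bits[b >> 3] >> (b & 7)) & 1u) != 0;
}

/* Return true when the called pattern is known to fail at position pos
   because the next input byte is not in its first set. At end of input
   this returns false (conservative: make the call and let it decide). */
static bool call_must_fail(const ByteSet *first, const char *input,
                           size_t len, size_t pos) {
    if (pos >= len) return false;
    return !byteset_has(first, (uint8_t)input[pos]);
}
```

For a pattern whose first set is {'a', 'b'}, `call_must_fail` is true on input "cat" at position 0 (the byte 'c' is not in the set) and false at position 1 (the byte 'a' is), so only the second position would require actually calling the pattern.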