
Symbol table in bytecode and proper relocations

Clean Importer requested to merge 27-interface-for-clean-programmers into master

This adds a symbol table to the bytecode. Each relocation is now a pair: an index into the code or data segment, and an index into the symbol table. Each symbol table entry consists of an index into code or data (the lowest bit is set iff the entry refers to data) followed by a 0-terminated string. The string is often empty for local labels.
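
To make the layout concrete, here is a minimal sketch of reading such a symbol table and applying relocations. It is not the code in this MR. The 32-bit little-endian word size, the `>> 1` decoding of the index, the ASCII names, and all names used here (`Symbol`, `Relocation`, `parse_symbol_table`, `apply_relocations`, `code_base`, `data_base`) are assumptions made for illustration only.

```python
import struct
from typing import List, NamedTuple


class Symbol(NamedTuple):
    offset: int    # index into the code or data segment
    is_data: bool  # True when the lowest bit of the stored word was set
    name: str      # often empty for local labels


class Relocation(NamedTuple):
    segment_index: int  # slot in the code/data segment to patch
    symbol_index: int   # entry in the symbol table to point at


def parse_symbol_table(buf: bytes, count: int) -> List[Symbol]:
    """Read `count` entries: one word, then a 0-terminated string.

    Assumption: the word is 32-bit little-endian and holds
    (index << 1) | is_data. The real encoding may differ.
    """
    symbols = []
    pos = 0
    for _ in range(count):
        (word,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        end = buf.index(b"\0", pos)          # find the string terminator
        name = buf[pos:end].decode("ascii")  # assumption: ASCII names
        pos = end + 1
        symbols.append(Symbol(word >> 1, bool(word & 1), name))
    return symbols


def apply_relocations(segment: List[int], relocs: List[Relocation],
                      symbols: List[Symbol],
                      code_base: int, data_base: int) -> None:
    """Patch each relocated slot with the address of its symbol."""
    for reloc in relocs:
        sym = symbols[reloc.symbol_index]
        base = data_base if sym.is_data else code_base
        segment[reloc.segment_index] = base + sym.offset
```

Because relocations refer to symbols and not to raw addresses, a loader can in principle redirect a symbol by name before patching, which is what the discussion points below rely on.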

To discuss (and defer to later MRs):

  • How to handle exceptions? Several kinds of exception can occur (illegal ABC instruction; heap full; stack overflow; timeout; halt), and ideally none of them crashes the host program. However, a deserialization function cannot simply yield a Maybe of the result type, since an exception may occur further down the graph than the head normal form (HNF). (This is now #31 (closed).)

  • Can we reuse host descriptors? (This has moved to #27 (closed).) For instance, if a bytecode program uses lists and has its own Cons and Nil descriptors, can we make it use the native descriptors instead by updating all labels? This would mean translating the bytecode program once to match the host program and only starting to run it after that. It would make descriptor translation (see below) trivial.

    If this is possible, applying bytecode functions to native expressions (e.g., a bytecode function [Int] -> Int to a native list) also comes for free, because the descriptors match. If not, we need to be able to translate descriptors in both directions and deal with the issue of non-existing descriptors (see below).

  • How to translate descriptors? (This has moved to #27 (closed).) There are three cases:

    • The same descriptor exists in the host program. These nodes can be translated.
    • The descriptor does not exist in the host program. A new descriptor needs to be added to the data segment of the host program, because the bytecode should be thrown away once interpretation is done. The nodes should then be translated to these new descriptors.
    • A descriptor with the same name exists in the host program, but its definition differs. This may be caused by different versions of a library. This needs to be detected and then handled like the second case.

    Presumably it is practical to store a translation table somewhere and fill it as needed.
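
The three cases and the lazily filled translation table could be sketched roughly as follows. This is only an illustration under assumptions: the names `Descriptor` and `Translator` are hypothetical, descriptors are compared by name and arity as a stand-in for whatever "the same descriptor" ends up meaning, and the `added` list stands in for appending to the host's data segment.

```python
from typing import Dict, List, NamedTuple


class Descriptor(NamedTuple):
    name: str
    arity: int  # hypothetical stand-in for the descriptor's full layout


class Translator:
    def __init__(self, host: Dict[str, Descriptor]) -> None:
        self.host = host                    # host descriptors, by name
        self.added: List[Descriptor] = []   # copies added to the host
        self.table: Dict[Descriptor, Descriptor] = {}  # filled as needed

    def translate(self, desc: Descriptor) -> Descriptor:
        # Reuse an earlier decision if this descriptor was seen before.
        if desc in self.table:
            return self.table[desc]
        candidate = self.host.get(desc.name)
        if candidate is not None and candidate == desc:
            # Case 1: the same descriptor exists in the host program.
            result = candidate
        else:
            # Case 2 (no such name) and case 3 (same name, different
            # definition) are handled alike: add a copy to the host so
            # the bytecode can be thrown away afterwards.
            result = Descriptor(desc.name, desc.arity)
            self.added.append(result)
        self.table[desc] = result
        return result
```

For example, with a host that defines `Cons` with arity 2, translating a bytecode `Cons` of arity 2 returns the host descriptor itself, while a bytecode `Cons` of arity 3 would be added as a new descriptor, once, no matter how many nodes refer to it.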

Edited by Clean Importer
