Reduce function call overhead
Up to now, the function call sequence in ECL for ordinary global functions looks as follows:
1. check whether the function is defined, i.e. whether symbol->symbol.gfdef is not NULL
2. set the_env->function to symbol->symbol.gfdef
3. call the function pointer symbol->symbol.gfdef->cfun.entry
4. the function pointer in symbol->symbol.gfdef->cfun.entry may be one of the dispatch functions from cfun_dispatch.d, which perform a further indirect call to the actual entry point
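
For readers less familiar with the runtime, the sequence above can be modelled roughly as follows. This is a simplified, self-contained sketch and not ECL's actual code: the types `fn_object`, `symbol_t` and `env_t`, the field `entry_fixed`, and the functions `dispatch1`, `add_one` and `call_global` are hypothetical stand-ins, and arguments are plain `long` values instead of Lisp objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the runtime structures. */
typedef struct fn_object fn_object;
typedef long (*entry_fn)(int narg, long arg); /* generic entry point */
typedef long (*fixed_fn)(long arg);           /* fixed-arity entry point */

struct fn_object {
    entry_fn entry;       /* models gfdef->cfun.entry */
    fixed_fn entry_fixed; /* assumed: the actual entry point behind a dispatcher */
};

typedef struct { fn_object *gfdef; } symbol_t;  /* models symbol->symbol.gfdef */
typedef struct { fn_object *function; } env_t;  /* models the_env->function */

static env_t the_env_storage;
static env_t *the_env = &the_env_storage;

/* Models a dispatch function: it finds the function object through
   the_env->function and performs a second indirect call. */
static long dispatch1(int narg, long arg)
{
    fn_object *fun = the_env->function;
    assert(narg == 1); /* stand-in for an argument count check */
    return fun->entry_fixed(arg);
}

/* An example "actual entry point". */
static long add_one(long x)
{
    return x + 1;
}

/* Models the call sequence described above. Returns 0 if the function is
   undefined (a real runtime would signal an error here), 1 otherwise. */
static int call_global(symbol_t *sym, long arg, long *result)
{
    fn_object *fun = sym->gfdef;
    if (fun == NULL)               /* check whether the function is defined */
        return 0;
    the_env->function = fun;       /* record the callee in the environment */
    *result = fun->entry(1, arg);  /* first indirect call, possibly a dispatcher */
    return 1;
}
```

In this model, a function object initialised as `{ dispatch1, add_one }` costs one NULL check and two indirect calls per invocation, which are the overheads this merge request targets.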
This merge request implements performance optimizations that allow us to a) skip the first step and b) replace the indirect call in the fourth step with a direct one. See the commit messages for details.
It also contains fixes for a few other issues I discovered while implementing this feature.