Speed up variable lookup
The following program takes ~3 seconds in the simulator, on a fast Intel i9 box:
LBL 'FillScr'
239
STO 'sy'
LBL j
399
STO 'sx'
LBL i
RCL 'sy'
RCL 'sx'
PIXEL
DSL 'sx'
GTO i
DSL 'sy'
GTO j
RTN
END
As you can see, all it does is fill the screen with pixels, something that should take only a small fraction of a second.
I realize that program execution on the R47 is interpreted, but we can always hope for less overhead, yes?
On profiling it here, I found that the single biggest hotspot is the call to findNamedVariable(), and nearly ⅔ of that is the inner call to _findReservedVariable().
I believe what we're seeing is the run-time cost of comparing these complicated internal string types. Because these comparisons occur for every pixel, there are roughly 97000 lookups even though the looked-up register numbers never change once the program starts.
Caching those lookups would help. You can prove this to yourself by replacing all named variables with register numbers. That instantly makes this test program run in a fraction of a second.
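To make the caching idea concrete, here is a minimal sketch in C. It is not the simulator's actual code: slow_find_named_variable() is a hypothetical stand-in for findNamedVariable(), whose real signature I have not reproduced, and NamedOperand, table_generation and the two-entry name table are invented for illustration. The idea is that each program step that names a variable remembers the register number it resolved to, and only repeats the slow search when the variable table has changed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_REGISTER (-1)

/* Hypothetical variable table; the real firmware keeps its own structures. */
static const char *const variable_names[] = { "sx", "sy" };

/* Counts slow searches so the effect of the cache can be observed. */
unsigned long slow_lookup_count = 0;

/* Bumped whenever a variable is created or deleted (assumed hook). */
unsigned table_generation = 1;

/* Stand-in for the expensive name search; NOT the real findNamedVariable(). */
int slow_find_named_variable(const char *name) {
    slow_lookup_count++;
    for (size_t i = 0; i < sizeof variable_names / sizeof variable_names[0]; i++) {
        if (strcmp(variable_names[i], name) == 0) {
            return (int)i;
        }
    }
    return NO_REGISTER;
}

/* One cache slot per program step that refers to a variable by name. */
typedef struct {
    const char *name;
    int reg;             /* last resolved register, or NO_REGISTER */
    unsigned generation; /* table_generation at the time of resolution */
} NamedOperand;

/* Return the register for op, searching by name only if the cache is stale. */
int resolve_operand(NamedOperand *op) {
    assert(op != NULL && op->name != NULL);
    if (op->generation != table_generation) {
        op->reg = slow_find_named_variable(op->name);
        op->generation = table_generation;
    }
    return op->reg;
}

/* Call after any change to the variable table to force fresh lookups. */
void invalidate_variable_cache(void) {
    table_generation++;
}

/* Mimic the inner loops of FillScr and report how many slow searches ran. */
unsigned long simulate_fill(int width, int height) {
    NamedOperand sx = { "sx", NO_REGISTER, 0 };
    NamedOperand sy = { "sy", NO_REGISTER, 0 };
    slow_lookup_count = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            (void)resolve_operand(&sy);
            (void)resolve_operand(&sx);
        }
    }
    return slow_lookup_count;
}
```

In this sketch, simulate_fill(400, 240) performs one slow search per name rather than one per access. The generation counter is one possible way to keep the cache honest when variables are created or deleted mid-run; the real code might prefer to invalidate at a different point.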
Even better would be to switch to UTF-8 strings, because then you could compare names with strcmp(), which the compilers and C libraries for all modern processors implement with heavily optimized code; lightning fast. I realize this is asking an awful lot. From this very example, we see how deep the roots of the custom string type go. It won't be easy to rip it all out, much less replace it.
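For what that could look like, here is a small sketch, again in C and again hypothetical. strcmp() is a standard C library function that compilers and libc implementations typically optimize heavily. Because UTF-8 encodes each code point as a unique byte sequence, two names are equal exactly when their bytes are equal, provided both are stored in the same normalization form. The table contents and the function names below are made up for illustration and are not the firmware's real reserved-variable list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reserved names, stored as UTF-8 byte strings. */
static const char *const reserved_names_utf8[] = {
    "\xCE\xB1",   /* Greek small alpha */
    "\xCE\x94x",  /* Greek capital delta followed by x */
    "sx",
};

/* Byte-wise equality is enough for UTF-8 names in the same normalization form. */
bool utf8_names_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Return the index of name in the table, or -1 if it is not reserved. */
int find_reserved_name_utf8(const char *name) {
    size_t count = sizeof reserved_names_utf8 / sizeof reserved_names_utf8[0];
    for (size_t i = 0; i < count; i++) {
        if (utf8_names_equal(reserved_names_utf8[i], name)) {
            return (int)i;
        }
    }
    return -1;
}
```

A linear scan like this still runs once per lookup, so it complements caching rather than replacing it; it only makes each individual comparison cheaper.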