Use the TI-Toolkit tokensheet
TI-Toolkit has a well-maintained and comprehensive tokensheet that derives from the TokenIDE sheet currently used here by titokens. I suggested in https://github.com/TI-Toolkit/tokens/issues/12 that it also include fields indicating which calculator charset codepoints represent each token. Now that that's done, some of the data that's currently hard-coded here can come from the tokens XML instead.
The existing XML file here probably can't be replaced completely, because it contains aliases for legacy token representations that existing programs written as plaintext rely on. Those aliases could perhaps be moved to a separate file that maps each canonical token string (or its bytes) to its aliases, which could also potentially be contributed back to TI-Toolkit.
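
As a rough sketch of what a separate alias file might look like, here is one possible shape: a JSON object keyed by hex-encoded token bytes, each mapping to a list of legacy spellings. The file format, the `load_aliases` helper, and the entries themselves are all assumptions made for illustration; none of them are taken from the TI-Toolkit sheet or the existing XML here.

```python
import json

# Hypothetical alias file contents: hex-encoded token bytes -> legacy spellings.
# The entries are illustrative placeholders, not data from either tokensheet.
ALIASES_JSON = """
{
  "04": ["->"],
  "AC": ["pi"],
  "BB31": ["two-byte-example"]
}
"""


def load_aliases(text):
    """Build an alias -> token-bytes lookup, rejecting ambiguous aliases."""
    lookup = {}
    for hex_bytes, aliases in json.loads(text).items():
        token = bytes.fromhex(hex_bytes)
        for alias in aliases:
            # An alias that points at two different tokens can't be tokenized
            # unambiguously, so fail loudly rather than pick one.
            if alias in lookup and lookup[alias] != token:
                raise ValueError(f"alias {alias!r} maps to more than one token")
            lookup[alias] = token
    return lookup


alias_to_token = load_aliases(ALIASES_JSON)
```

Keying on bytes rather than on the canonical string would keep the alias file valid even if the upstream sheet renames a token, at the cost of being harder to read by hand.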