Changes

Tristano Ajmone · 5c38b511
--- a/LangDefs-Debugging.md
+++ b/LangDefs-Debugging.md
@@ -57,7 +57,9 @@ A python example, without the State IDs plugin:

 The plugin adds, at each parser state change, the integer of the new state enclosed in parenthesis. This reveals us interesting details about the parser's inner workings; for example, from the above screenshot we can notice that during the string parsing the parser updates the syntax multiple times, even though the same state is confirmed. This shows us that the parser is consuming the string in chunks, trying to isolate any tokens that could match legitimate sub-string elements (escape sequences, interpolations).

-As for the actual numbers, these represent the various possible parser states, which are assigned at initialization time, and might vary with each syntax (depending on what elements are actually defined). The correspondence between parser states and integer values can be retrived via the `--verbose` option on the command line (actual output cut-down here, for space reasons):
+As for the actual numbers, these represent the various possible parser states, which are assigned at initialization time, and might vary with each syntax (depending on what elements are actually defined). The correspondence between parser states and integer values can be retrieved via the `--verbose` option.
+
+Let's try it with the example file used with HighlightGUI in our screenshots. From the command line we'll invoke `highlight --verbose StatesIDs-plugin-Example.py` and try to pinpoint in the output which states correspond to "`11`", "`9`" and "`1`" (the actual output is cut down here, for space reasons):

 ```
 > highlight --verbose StatesIDs-plugin-Example.py
@@ -79,12 +81,39 @@ HL_OPERATOR: number [ 9 ]                   <--  (9) = Operators
 HL_OPERATOR_END: number [ 18 ]
 ...
 HL_STANDARD: number [ 0 ]
-HL_STRING: number [ 1 ]                     <--  (1) = String
+HL_STRING: number [ 1 ]                     <--  (1) = Strings
 HL_STRING_END: number [ 12 ]
 HL_UNKNOWN: number [ 100 ]
 ...
 ```

+I've added arrows on the right side, pointing to the values we were looking for. Now we know what these numbers mean in terms of parser states:
+
+- `1` is `Strings`
+- `9` is `Operators`
+- `11` is `Keywords`
+
+Let's analyze the plugin output:
+
+![state-IDs plugin ON][state-IDs ON]
+
+We can now get a clear picture of how the HL parser is tokenizing the "`print("Hello!")`" line, step by step:
+
+
+| token          | state ID        | parser state      |
+| :------------- | ---------:      | :---------------- |
+| “`print`”      | (__11__)        | Keyword token     |
+| “`(`”          | (__9__)         | Operator token    |
+| “`"`”          | (__1__)         | String token      |
+| “`Hello`”      | (__1__) (__1__) | String token      |
+| “`!`”          | (__1__) (__1__) | String token      |
+| “`"`”          | (__1__)         | String token      |
+| “`)`”          | (__9__)         | Operator token    |
+
+You'll also notice that `syntaxUpdate()` is being called twice for tokens inside the string (i.e., for “`Hello`” and “`!`”). This means that, for the current syntax definition, the parser needs to undergo two state updates to evaluate those tokens: one update to establish that they are not sub-elements (e.g., an escape sequence), and another to establish that the string state needs to carry on.
+
+In complex language definitions, the parser might go through multiple updates to evaluate each token, depending on the token's context and the definitions provided by the syntax, and especially when custom rules hooked into `OnStateChange()` force it to return with custom values (e.g., `HL_REJECT`, `HL_STANDARD`, etc.).
+