For some reason, there was previously extra logic that prevented
the external scanner from being invoked if the only valid external
token also had an internal definition. That behavior was surprising:
the external scanner should be called whenever an external token is
valid.
The current, fairly conservative approach is to avoid merging parse states
that would cause a pair of tokens to co-exist for the first time in any
parse state, where the two tokens can start with the same character and at
least one of them can contain a character that is part of the grammar's
separators.
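For illustration, here is a rough, self-contained sketch of that conflict check; the `Token` representation is invented for this example and is not the generator's actual data structure.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  uint64_t first_chars[4];   // bitset over the characters the token can start with
  bool contains_separator;   // can the token contain a grammar separator character?
} Token;

// Merging two states is blocked when it would make `a` and `b` valid
// together for the first time, they can start with the same character,
// and at least one of them can contain a separator character.
static bool tokens_conflict(const Token *a, const Token *b) {
  bool share_first_char = false;
  for (int i = 0; i < 4; i++) {
    if (a->first_chars[i] & b->first_chars[i]) {
      share_first_char = true;
      break;
    }
  }
  return share_first_char &&
         (a->contains_separator || b->contains_separator);
}
```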
They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
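For example, a small usage sketch, assuming the language functions now follow the `tree_sitter_<name>` convention (the new names themselves are not shown here, so treat that naming as an assumption):

```c
#include <stdio.h>
#include <tree_sitter/api.h>

// Provided by a compiled grammar; the name is assumed, not confirmed above.
const TSLanguage *tree_sitter_python(void);

int main(void) {
  const TSLanguage *language = tree_sitter_python();
  // `ts_language_*` functions now unambiguously operate on TSLanguage values.
  printf("symbol count: %u\n", ts_language_symbol_count(language));
  return 0;
}
```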
Tokens that are defined in the grammar's rules may now also be included in
the externals list, so that external scanners can check whether those
tokens are valid lookaheads and, if so, return them to the parser when
needed.
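For example, a minimal external scanner sketch (the grammar name `mylang` and the token `STRING_CONTENT` are hypothetical) showing how the `valid_symbols` array can be consulted, and how returning `false` defers to the internal lexer when the token also has an internal definition:

```c
#include <tree_sitter/parser.h>

// Order matches the grammar's externals list.
enum TokenType { STRING_CONTENT };

bool tree_sitter_mylang_external_scanner_scan(
  void *payload,
  TSLexer *lexer,
  const bool *valid_symbols
) {
  if (valid_symbols[STRING_CONTENT]) {
    // Consume characters here and, if a token is recognized, hand it back:
    //   lexer->result_symbol = STRING_CONTENT;
    //   return true;
  }
  // Returning false means the external scanner produced nothing, so the
  // internal lexer still gets a chance to match the token.
  return false;
}
```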
* While generating the lex table, note which tokens can match the
same string. A token needs to be relexed when it has possible
homonyms in the current state.
* Also note which tokens can match substrings of other tokens.
A token needs to be relexed when the current state contains viable
tokens that could match longer strings and the next token has been
edited. (A sketch of this reuse check follows the list.)
* Remove the logic for marking tokens as fragile on creation.
* Move the reusability/non-reusability of symbols off of individual
actions and onto the entire entry for the state & symbol.
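For illustration, a rough, self-contained sketch of the resulting reuse check; the names and data layout are invented for this example, not the actual generator structures:

```c
#include <stdbool.h>

typedef struct {
  bool has_homonyms;           // another token in this state can match the same string
  bool longer_match_possible;  // a viable token in this state could match a longer string
} LexStateEntry;

// A previously lexed token can be reused only if it has no homonyms in
// the current state and, when the following token has been edited, no
// viable token here could match a longer string.
static bool can_reuse_token(const LexStateEntry *entry, bool next_token_edited) {
  if (entry->has_homonyms) return false;                // must relex: ambiguous match
  if (entry->longer_match_possible && next_token_edited)
    return false;                                       // must relex: match could grow
  return true;
}
```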
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.
Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
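For example, a small sketch of how such an error now surfaces through the public API (the Python grammar is just a stand-in for any compiled language):

```c
#include <stdio.h>
#include <string.h>
#include <tree_sitter/api.h>

const TSLanguage *tree_sitter_python(void);  // any compiled grammar works here

int main(void) {
  TSParser *parser = ts_parser_new();
  ts_parser_set_language(parser, tree_sitter_python());

  const char *source = "def f(:";  // syntactically invalid
  TSTree *tree = ts_parser_parse_string(parser, NULL, source, strlen(source));
  TSNode root = ts_tree_root_node(tree);

  // The unexpected token is wrapped in an ERROR node by the parser,
  // rather than being turned into an error leaf by the lexer.
  printf("has error: %d\n", ts_node_has_error(root));

  ts_tree_delete(tree);
  ts_parser_delete(parser);
  return 0;
}
```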
* Use GLR stack-splitting to try every possible number of tokens to
discard until a repair is found.
* Check the validity of repairs by looking at the child trees, rather
than the statically-computed 'in-progress symbols' list.
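For illustration, a toy, self-contained sketch of the discard loop; this is not the actual parser code, just the shape of the idea:

```c
#include <stdbool.h>

#define MAX_STACK 16

typedef struct {
  int states[MAX_STACK];
  int size;
} StackVersion;

// Stand-in for the parse table: can `state` shift `symbol`?
static bool can_shift(int state, int symbol) {
  return state == symbol;  // arbitrary toy rule for illustration
}

// Try discarding 0, 1, 2, ... entries, each as its own copy of the stack
// (GLR-style); return the count that repairs the parse, or -1 if none does.
static int find_repair(const StackVersion *stack, int lookahead_symbol) {
  for (int count = 0; count < stack->size; count++) {
    StackVersion version = *stack;  // split off a copy of the stack
    version.size -= count;          // discard `count` entries
    if (can_shift(version.states[version.size - 1], lookahead_symbol))
      return count;
  }
  return -1;
}
```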