We previously maintained a set of individual productions that were
involved in conflicts, but that was subtly incorrect because
we don't compare productions themselves when comparing parse items;
we only compare the parse items properties that could affect the
final reduce actions.
This speeds up parser generation by increasing the likelihood that we'll recognize
parse item sets as equivalent in advance, rather than having to merge their states
after the fact.
While generating the parse table, keep track of which tokens can follow one another.
Then use this information to evaluate token conflicts more precisely. This will
result in a smaller parse table than the previous, overly-conservative approach.
This fixes some randomized test failures in the C grammar, relating to Object-like macros.
The object-like macro rule relies on a whitespace token in order to distinguish object-like
macros whose values begin with a '(' from function-like macros. The presence of that
whitespace token means that other nodes should not be reusable in that state.
This fixes a bug in the C++ grammar where the `>>` token was merged into
a state where it was previously not valid, but the `>` token *was*
valid. This caused nested templates like -
std::vector<std::pair<int, int>>
to not parse correctly.
The current pretty conservative approach is to avoid merging parse states which
would cause a pair tokens to co-exist for the first time in any parse state,
where the two tokens can start with the same character and at least one of the
tokens can contain a character which is part of the grammar's separators.
* Remove remnants of templatized remove_duplicate_states function
* Rename recovery_tokens function to get_compatible_tokens and augment it
also compute pairs of tokens which could potentially be incompatible