Due to an oversight in #1589, I added `primary_field_ids` into the
`TSLanguage` struct in a place that wasn't the end. This is not actually
backwards compatible and causes downstream failures :(
This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).
With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
Some languages have the notion of modules, and to represent those
we've started to use a `@module` tag, as discussed in
https://github.com/elixir-lang/tree-sitter-elixir/issues/15.
Because historically we've used the constructor highlight color for
modules in JS/Ruby, it's defined to map to the same color.
* Fix bugs related to named wildcard patterns vs regular wildcard patterns.
* Fix handling of extra nodes during query analysis. Previously, the
expected child_index was updated incorrectly after an extra node,
leading to false "impossible pattern" errors.
* Refine logic for avoiding unnecessary state-splitting due to fallible steps.
Compute *two* different analysis results related to step fallibility:
* `root_pattern_guaranteed` which, like before, summarizes whether the
entire pattern is guaranteed to match once this step is reached.
* `parent_pattern_guaranteed` - which just indicates whether the
immediate parent pattern is guaranteed. This is now used when
deciding whether it's necessary to split a match state.
These tests are easier to write and maintain if the grammars are just JS,
like grammars normally are. It doesn't slow the tests down significantly
to shell out to `node` for each of these grammars.