This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).
With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
- Use a proper enum type for quantifiers.
- Drop quantifiers from `TSQueryStep`, which was not used.
- Keep track of the captures introduced during a pattern parse, and
apply the quantifier for the pattern to the captures that were
introduced by the pattern or any sub patterns.
- Use 'quantifier' instead of 'suffix'.
The default is now a whopping 64K matches, which "should be enough for
everyone". You can use the new `ts_query_cursor_set_match_limit`
function to set this to a lower limit, such as the previous default of
32.
This function (and the similar `ts_tree_cursor_goto_first_child_for_byte`)
allows you to efficiently seek the tree cursor to a given position,
exploiting the tree's internal balancing, without having to visit
all of the preceding siblings of each node.
Also remove the use of bitfields from the parse table format.
In all cases, bitfields were not necessary to achieve the
current binary sizes. Avoiding them makes the binaries more
portable.
There was no way to make this change backward-compatible,
so we have finally dropped support for parsers generated
with an earlier version of Tree-sitter.
At some point, when Atom adopts this version of Tree-sitter,
this change will affect Atom users who have installed packages
using third-party Tree-sitter parsers. The packages will need
to be updated to use a regenerated version of the parsers.