Don't start with duplicate states in ts_query__analyze_patterns

This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).

With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
This commit is contained in:
Alex Pinkus 2022-01-11 21:57:06 -08:00
parent bf210f0c9e
commit eaf9b170f1
6 changed files with 64 additions and 23 deletions

View file

@ -960,28 +960,30 @@ static bool ts_query__analyze_patterns(TSQuery *self, unsigned *error_offset) {
if (lookahead_iterator.next_state != state) {
state_predecessor_map_add(&predecessor_map, lookahead_iterator.next_state, state);
}
const TSSymbol *aliases, *aliases_end;
ts_language_aliases_for_symbol(
self->language,
lookahead_iterator.symbol,
&aliases,
&aliases_end
);
for (const TSSymbol *symbol = aliases; symbol < aliases_end; symbol++) {
array_search_sorted_by(
&subgraphs,
.symbol,
*symbol,
&subgraph_index,
&exists
if (ts_language_state_is_primary(self->language, state)) {
const TSSymbol *aliases, *aliases_end;
ts_language_aliases_for_symbol(
self->language,
lookahead_iterator.symbol,
&aliases,
&aliases_end
);
if (exists) {
AnalysisSubgraph *subgraph = &subgraphs.contents[subgraph_index];
if (
subgraph->start_states.size == 0 ||
*array_back(&subgraph->start_states) != state
)
array_push(&subgraph->start_states, state);
for (const TSSymbol *symbol = aliases; symbol < aliases_end; symbol++) {
array_search_sorted_by(
&subgraphs,
.symbol,
*symbol,
&subgraph_index,
&exists
);
if (exists) {
AnalysisSubgraph *subgraph = &subgraphs.contents[subgraph_index];
if (
subgraph->start_states.size == 0 ||
*array_back(&subgraph->start_states) != state
)
array_push(&subgraph->start_states, state);
}
}
}
}