They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
There was an error in the way that we calculate the reference
scope sequences that are used as the basis for assertions about
changed ranges in randomized tests. The error caused some
characters' scopes to not be checked. This corrects the reference
implementation and fixes a previously uncaught bug in the
implementation of `tree_path_get_changed_ranges`.
Previously, when iterating over the old and new trees, we would
only perform comparisons of visible nodes. This resulted in a failure
to do any comparison for portions of the text in which there were
trailing invisible child nodes (e.g. trailing `_line_break` nodes
inside `statement` nodes in the JavaScript grammar).
Now, we additionally perform comparisons at invisible leaf nodes,
based on their lowest visible ancestor.
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32 bit integers seems like the right choice.
* While generating the lex table, note which tokens can match the
same string. A token needs to be relexed when it has possible
homonyms in the current state.
* Also note which tokens can match substrings of each other tokens.
A token needs to be relexed when there are viable tokens that
could match longer strings in the current state and the next
token has been edited.
* Remove the logic for marking tokens as fragile on creation.
* Store the reusability/non-reusability of symbols off of individual
actions and onto the entire entry for the state & symbol.
This callback-based API allows the parser to easily visit each interior node
of the stack when searching for an error repair. It also is a better abstraction
over the stack's DAG implementation than having the public functions for
accessing entries and their successor entries.