* Use GLR stack-splitting to try all numbers of tokens to
discard until a repair is found.
* Check the validity of repairs by looking at the child trees,
rather than the statically-computed 'in-progress symbols' list
This callback-based API allows the parser to easily visit each interior node
of the stack when searching for an error repair. It also is a better abstraction
over the stack's DAG implementation than having the public functions for
accessing entries and their successor entries.
Since the capacity is now included in the return value, the buffer
can be reused in the ts_parser__accept function. Also, it's just
cleaner to use Array consistently, rather than a separate buffer
and size.
Instead of child() vs concrete_child(), next_sibling() vs next_concrete_sibling(), etc,
the default is switched: child() refers to the concrete syntax tree, and named_child()
refers to the AST. Because the AST is abstract through exclusion of some nodes, the
names are clearer if the qualifier goes on the AST operations
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
This reverts commit 5cd07648fd.
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime