For now, only reduce/reduce conflicts w/ no tie-breaking precedence
are treated as errors. The rest are dropped, because shift/reduce
conflicts are currently very common because we don't have a way
of specifying associativity along w/ precedence.
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
This reverts commit 5cd07648fd.
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
The lexer doesn't know the expected symbols, so it doesn't have enough
information to construct error nodes. Now, when it encounters an invalid
character, it returns NULL and the parser builds a correct error node.
Now, the root node of a document is always a document node.
It will often have only one child node which corresponds to the grammar's
start symbol, but not always. Currently, it may have more than one child
if there are ubiquitous tokens such as comments at the beginning of the
document. In the future, it will also be possible be possible to have multiple
for the document to have multiple children if the document is partially parsed.
There's no need for a `string` function since one already
exists for Nodes.
Now the root node is always stored on the document. This
means callers of `ts_document_root_node` don't need to release
its return value.