This drove out several fixes to the incremental parsing algorithm.
There are a few examples which still don't work and which have
been explicitly excluded, for now.
I want to re-use the stack data structure for storing the
re-usable nodes from the previous parse tree during an edit.
In this case, the stack won't conceptually start at position
zero, so the name 'right_position' doesn't make sense.
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
This reverts commit 5cd07648fd.
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
When breaking down the stack in parser.c, the previous code
would not account for ubiquitous tokens. This was a problem
for a long time, but wasn't noticed until ubiquitous tokens
started being used to represent separator characters
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
Previously, if an error happened right at the beginning of an error
production, the error node would be immediately shifted onto the stack
without calling the error handling function.
The lexer doesn't know the expected symbols, so it doesn't have enough
information to construct error nodes. Now, when it encounters an invalid
character, it returns NULL and the parser builds a correct error node.