Max Brunsfeld
0ec7e5ce42
Remove ts_stack_force_merge function
2018-04-06 13:26:18 -07:00
Max Brunsfeld
80f856cef5
Maintain a total node count on every tree
This simplifies (and fixes bugs in) the parse stack's tracking of its
total node count since the last error, which is needed for error
recovery.
2018-04-06 13:26:18 -07:00
Max Brunsfeld
e59558c83b
Allow stack versions to be temporarily paused
This way, when detecting an error, we can defer the decision about
whether to bail or recover until all stack versions are processed.
2018-04-06 13:26:18 -07:00
Max Brunsfeld
5520983144
Clean up Stack API
* Remove StackPopResult
* Rename top_state() -> state()
* Rename top_position() -> position()
* Improve docs
2018-03-29 17:37:54 -07:00
Max Brunsfeld
ee995c3d6b
Avoid redundant retains/releases by giving ts_stack_push move semantics
2018-03-29 17:18:43 -07:00
Max Brunsfeld
e917756ad1
Remove depends_on_lookahead field from parse table entries
This simplifies the logic for determining whether a token is reusable
and makes it more conservative. It should fix some incremental parsing
bugs that are being caught by the randomized tests on CI.
2018-03-28 10:58:33 -07:00
Max Brunsfeld
e927d02f43
Allow reusing leaf nodes unless the next leaf has changes
2018-03-07 17:44:54 -08:00
Max Brunsfeld
c0cc35ff07
Create separate lexer function for keywords
2018-03-07 12:00:26 -08:00
Max Brunsfeld
f96969738b
Don't remove mergeable stack versions so aggressively during condense
2018-03-05 10:40:05 -08:00
Max Brunsfeld
dbc0c208f4
Add missing initialization of parser's in_ambiguity state
2018-03-02 15:25:39 -08:00
Max Brunsfeld
52087de4f0
Remove the concept of fragile reductions
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
2018-03-02 14:51:54 -08:00
Max Brunsfeld
299a146b66
Balance repetition trees after parsing
2018-02-12 11:41:56 -08:00
Max Brunsfeld
8c29841adf
Represent repetitions with associative structure
2018-02-12 11:41:56 -08:00
Max Brunsfeld
46dcd53090
Do not insert missing tokens if halt_on_error option is passed
2018-01-24 14:04:55 -08:00
Max Brunsfeld
dafa897021
Bail on error recovery if too many alternative parses have already completed
2018-01-09 17:08:36 -08:00
Max Brunsfeld
21a88b1731
Don't count less-far-along versions in better_version_exists method
2017-12-29 16:10:43 -08:00
Max Brunsfeld
d3c85f288d
Start work on repairing errors by inserting missing tokens
2017-12-29 15:11:00 -08:00
Max Brunsfeld
f2dc620610
Extract parser__recover_to_state method
2017-12-29 15:10:59 -08:00
Max Brunsfeld
adf47e2b57
Fix invalid usage of 'extra' field on non-shift parse action
2017-12-29 11:46:41 -08:00
Max Brunsfeld
d9094e8146
Consolidate more logic into do_potential_reductions method
2017-12-28 15:49:48 -08:00
Max Brunsfeld
172cbb2d22
Fix infinite loop due to skipping empty tokens during error recovery
2017-12-27 11:18:06 -08:00
Max Brunsfeld
addeb6c4c1
Allocate and free trees using an object pool
2017-12-27 10:34:29 -08:00
Max Brunsfeld
0e69da37a5
Return a character count from the lexer's get_column method
2017-12-20 16:26:38 -08:00
Max Brunsfeld
fbcefe25f7
Avoid creating external tokens that start after they end
2017-12-07 11:50:27 -08:00
Max Brunsfeld
5d676de051
Remove unnecessary conditional in parser__accept
2017-12-07 11:50:27 -08:00
Max Brunsfeld
48681c3f0e
Initialize error start and end positions at their declarations
Fixes #113
Clang doesn't seem to be able to tell that these variables were guaranteed to
be initialized by the time they were read.
2017-10-31 10:06:44 -07:00
Max Brunsfeld
121a6a66ec
Take total dynamic precedence into account in stack version sorting
Signed-off-by: Josh Vera <vera@github.com>
2017-10-09 15:51:22 -07:00
Max Brunsfeld
36c2b685b9
Always invalidate old chunk of text when parsing after an edit
2017-10-04 15:09:46 -07:00
Max Brunsfeld
b0fdc33f73
Remove 'extra' and 'structural' booleans from symbol metadata
2017-09-14 12:07:46 -07:00
Max Brunsfeld
91456d7a17
Avoid duplicate error state entries for tokens that are both internal & external
2017-09-14 10:54:13 -07:00
Max Brunsfeld
2721f72c41
Represent MAX_COST_DIFFERENCE as unsigned
2017-09-13 16:49:18 -07:00
Max Brunsfeld
c1cf8e02a7
Merge pull request #101 from tree-sitter/merge-more-lex-states
Reduce the number of states in the generated lexer function
2017-09-13 16:46:58 -07:00
Max Brunsfeld
d291af9a31
Refactor error comparisons
* Deal with mergeability outside of error comparison function
* Make `better_version_exists` function pure (don't halt other versions
  as a side effect).
* Tweak error comparison logic
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-13 16:38:15 -07:00
Max Brunsfeld
07fb3ab0e6
Abort recoveries before popping if better versions already exist
2017-09-13 09:56:51 -07:00
Max Brunsfeld
47669e6015
Avoid halting the only non-halted entry in recover
2017-09-12 16:20:06 -07:00
Max Brunsfeld
819235bac3
Limit the number of stack nodes that are included in a summary
2017-09-12 12:00:00 -07:00
Max Brunsfeld
99d048e016
Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.
This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.
This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.
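The recovery strategy described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the code from this commit: `Entry`, `SketchStack`, `sketch_recover`, and `demo_is_valid` are hypothetical names, the stack is a flat array rather than Tree-sitter's graph-structured stack, and the validity callback stands in for a parse-table lookup. The sketch also assumes the error wrapper is pushed in state S.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 32

/* Hypothetical, simplified stack entry; not Tree-sitter's real Stack API. */
typedef struct {
  int state;      /* parse state after this entry was pushed */
  int node_count; /* number of tree nodes held by this entry */
  bool is_error;  /* true if this entry wraps discarded nodes */
} Entry;

typedef struct {
  Entry entries[MAX_DEPTH];
  size_t size;
} SketchStack;

/* Hypothetical stand-in for "is this token valid in this parse state?". */
typedef bool (*TokenIsValid)(int state, int token);

/* Toy validity rule for demonstration: token 7 is only valid in state 1. */
bool demo_is_valid(int state, int token) {
  return state == 1 && token == 7;
}

/*
 * On an unexpected token T: search below the top of the stack for the
 * nearest previous state S in which T is valid, pop every entry after S,
 * and push a single error entry that wraps the popped nodes.
 * Returns false (leaving the stack untouched) if no such state exists.
 */
bool sketch_recover(SketchStack *stack, int token, TokenIsValid is_valid) {
  assert(stack->size > 0 && stack->size <= MAX_DEPTH);

  /* Start one below the top: the top state is where T was unexpected. */
  for (size_t i = stack->size - 1; i-- > 0;) {
    if (!is_valid(stack->entries[i].state, token)) continue;

    /* Count the nodes in all entries after S. */
    int wrapped = 0;
    for (size_t j = i + 1; j < stack->size; j++) {
      wrapped += stack->entries[j].node_count;
    }

    /* Pop them and replace them with one error entry, staying in state S. */
    stack->size = i + 1;
    stack->entries[stack->size++] = (Entry){
      .state = stack->entries[i].state,
      .node_count = wrapped,
      .is_error = true,
    };
    return true;
  }
  return false;
}
```

Because the stack ends up back in an ordinary state S, the next call to the lexer would use that state's normal, non-conflicting token set, which is the property the message above relies on.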
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
Max Brunsfeld
4c9c05806a
Merge compatible starting token states before constructing lex table
2017-09-05 13:21:53 -07:00
Max Brunsfeld
ac9d260734
Clean up parser fields
2017-08-31 12:50:10 -07:00
Max Brunsfeld
4a0587061e
Consolidate logic for deciding on a lookahead node
2017-08-31 12:19:37 -07:00
Max Brunsfeld
41074cbf2d
🎨
2017-08-30 16:48:15 -07:00
Max Brunsfeld
fdc6ee445b
Remove parser__push helper function
2017-08-30 16:41:07 -07:00
Max Brunsfeld
1b1276bdbf
Simplify parser__condense_stack function
2017-08-30 16:36:02 -07:00
Max Brunsfeld
96a630e5df
Clean up check for leaf node reusability
2017-08-30 16:19:51 -07:00
Max Brunsfeld
8bdab7335e
Remove unnecessary reusability check after breaking down lookahead
2017-08-30 16:19:11 -07:00
Max Brunsfeld
bef536a7d0
Discard fragile reusable nodes earlier
2017-08-30 16:17:10 -07:00
Max Brunsfeld
5cbd50c7d7
Remember how far ahead the lexer looked on failed calls
This needs to be included in the 'bytes_scanned' property of the token
that is ultimately produced.
2017-08-29 15:04:22 -07:00
Max Brunsfeld
f3977ec213
Always call deserialize on external scanner before scanning
Remembering the last token that the external scanner produced is
not worth the complexity.
2017-08-29 14:41:55 -07:00
Max Brunsfeld
4d63e26e9e
Clean up logic for falling back to error mode after lexing fails
2017-08-25 16:57:09 -07:00
Max Brunsfeld
86d5737fc2
Escape quotes when printing symbols to dot graphs
2017-08-25 16:26:40 -07:00