1208 commits

Author SHA1 Message Date
Max Brunsfeld
82c7e170b3 Fix case where loop was created in the parse stack
Fixes #133
2018-03-02 09:05:20 -08:00
Max Brunsfeld
32ef3e001a Account for epsilon external tokens when merging parse states
Do not merge a token T into a parse state S if S contains
external tokens that can be *followed* by tokens that could
be shadowed by T.

At this point, the only automated test for this logic is via
the bash grammar, in which the `]` token should not be merged
into states in which `_concat` is valid, because `_concat`
can be followed by a `_special_characters` token, and `]`
would shadow `_special_characters`.
2018-02-28 14:47:04 -08:00
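
The rule in commit 32ef3e001a can be read as a predicate over a few sets. The sketch below is only an illustration in Python, not the generator's actual code: `can_merge_token`, `followers`, and `shadowed_by` are hypothetical names, and the sets are filled in by hand here, whereas the real tool would have to derive them from the grammar.

```python
from typing import Dict, Set


def can_merge_token(token: str,
                    state_external_tokens: Set[str],
                    followers: Dict[str, Set[str]],
                    shadowed_by: Dict[str, Set[str]]) -> bool:
    """Return False if merging `token` into the state could shadow a token
    that may follow one of the state's external tokens.

    followers[e]   -- tokens that can follow external token e (assumed input)
    shadowed_by[t] -- tokens that t would shadow (assumed input)
    """
    shadowed = shadowed_by.get(token, set())
    for external in state_external_tokens:
        # Any overlap means `token` could hide a valid follower.
        if followers.get(external, set()) & shadowed:
            return False
    return True


# Hand-written data mirroring the bash example in the commit message.
followers = {"_concat": {"_special_characters"}}
shadowed_by = {"]": {"_special_characters"}}

# A state in which `_concat` is valid must not receive `]`.
merge_into_concat_state = can_merge_token("]", {"_concat"}, followers, shadowed_by)
# A state with no external tokens is unaffected by this rule.
merge_into_plain_state = can_merge_token("]", set(), followers, shadowed_by)
```

Here `merge_into_concat_state` is `False` and `merge_into_plain_state` is `True`, matching the behavior the commit message describes for `]` and `_concat`.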
Max Brunsfeld
10a3cbd814 Move grammar schema to src folder
Now that there's a docs folder that contains actual docs.
2018-02-26 00:40:20 -08:00
Max Brunsfeld
16a45d4aa4 Fix hole in logic for terminating tree balancing
It's important that the repetition nodes have a child count of 2,
because we assign to their second child. We could maybe generalize
this to allow balancing in the presence of 'extra' nodes like comments
and errors, but this might be complicated.
2018-02-16 12:44:30 -08:00
Max Brunsfeld
2daae48fe0 Handle conflicts in repeat rules after external tokens
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2018-02-14 11:24:51 -08:00
Max Brunsfeld
facafcd6e4 Pass row/column position to input seek method 2018-02-14 07:31:49 -08:00
Max Brunsfeld
134c455b80 Simplify logic for terminating tree balancing 2018-02-12 12:13:21 -08:00
Max Brunsfeld
299a146b66 Balance repetition trees after parsing 2018-02-12 11:41:56 -08:00
Max Brunsfeld
8c29841adf Represent repetitions with associative structure 2018-02-12 11:41:56 -08:00
Max Brunsfeld
f8704d28da Log changed ranges when reparsing 2018-01-26 15:40:07 -08:00
Max Brunsfeld
46dcd53090 Do not insert missing tokens if halt_on_error option is passed 2018-01-24 14:04:55 -08:00
Max Brunsfeld
919c9d8715 Ensure root node has a null parent 2018-01-23 17:20:15 -08:00
Max Brunsfeld
b520bdd2d5 Merge pull request #126 from tree-sitter/mb-fix-epsilon-rule-loophole
Don't allow an epsilon start rule if it is used in other rules
2018-01-23 17:19:55 -08:00
Max Brunsfeld
2e4f76c164 Don't allow an epsilon start rule if it is used in other rules 2018-01-23 17:05:28 -08:00
Max Brunsfeld
dafa897021 Bail on error recovery if too many alternative parses have already completed 2018-01-09 17:08:36 -08:00
Max Brunsfeld
315dff3285 Add an API for getting a node's child index 2018-01-09 14:01:36 -08:00
Max Brunsfeld
f653f2b3bb Add ts_node_first_{child,named_child}_for_byte methods 2018-01-09 13:44:59 -08:00
Max Brunsfeld
1e04489e50 Fix error in handling of padding in get_changed_ranges 2017-12-29 18:02:06 -08:00
Max Brunsfeld
f3c3fd3c9e Make it easier to enable/disable logging in get_changed_ranges 2017-12-29 18:01:42 -08:00
Max Brunsfeld
21a88b1731 Don't count less-far-along versions in better_version_exists method 2017-12-29 16:10:43 -08:00
Max Brunsfeld
d3c85f288d Start work on repairing errors by inserting missing tokens 2017-12-29 15:11:00 -08:00
Max Brunsfeld
f2dc620610 Extract parser__recover_to_state method 2017-12-29 15:10:59 -08:00
Max Brunsfeld
adf47e2b57 Fix invalid usage of 'extra' field on non-shift parse action 2017-12-29 11:46:41 -08:00
Max Brunsfeld
d9094e8146 Consolidate more logic into do_potential_reductions method 2017-12-28 15:49:48 -08:00
Max Brunsfeld
eee3db08d2 Avoid repeated calls to {start,end}_point in descendant_for_point_range 2017-12-27 11:55:52 -08:00
Max Brunsfeld
172cbb2d22 Fix infinite loop due to skipping empty tokens during error recovery 2017-12-27 11:18:06 -08:00
Max Brunsfeld
2625c3a96c Remove dead code in string_input.c 2017-12-27 10:34:29 -08:00
Max Brunsfeld
addeb6c4c1 Allocate and free trees using an object pool 2017-12-27 10:34:29 -08:00
Max Brunsfeld
0e69da37a5 Return a character count from the lexer's get_column method 2017-12-20 16:26:38 -08:00
Max Brunsfeld
fcff16cb86 Add get_column method to lexer 2017-12-19 17:54:15 -08:00
Max Brunsfeld
532bbeca0d Remove wrong handling of \a in a regex 2017-12-12 16:50:53 -08:00
Max Brunsfeld
fbcefe25f7 Avoid creating external tokens that start after they end 2017-12-07 11:50:27 -08:00
Max Brunsfeld
5d676de051 Remove unnecessary conditional in parser__accept 2017-12-07 11:50:27 -08:00
Max Brunsfeld
493db39363 Never move the start rule of a grammar into the lexical grammar
This preserves a useful invariant that the root node of the AST is never
a token.
2017-12-07 11:50:27 -08:00
Max Brunsfeld
48681c3f0e Initialize error start and end positions at their declarations
Fixes #113

Clang doesn't seem to be able to tell that these variables were guaranteed to
be initialized by the time they were read.
2017-10-31 10:06:44 -07:00
Max Brunsfeld
121a6a66ec Take total dynamic precedence into account in stack version sorting
Signed-off-by: Josh Vera <vera@github.com>
2017-10-09 15:51:22 -07:00
Max Brunsfeld
36c2b685b9 Always invalidate old chunk of text when parsing after an edit 2017-10-04 15:09:46 -07:00
Max Brunsfeld
9f24118b17 Include trees' dynamic precedence in debug graphs 2017-10-04 10:41:20 -07:00
Max Brunsfeld
ba607a1f84 Optimize lex state merging 2017-09-18 13:40:37 -07:00
Max Brunsfeld
b0fdc33f73 Remove 'extra' and 'structural' booleans from symbol metadata 2017-09-14 12:07:46 -07:00
Max Brunsfeld
91456d7a17 Avoid duplicate error state entries for tokens that are both internal & external 2017-09-14 10:54:13 -07:00
Max Brunsfeld
2721f72c41 Represent MAX_COST_DIFFERENCE as unsigned 2017-09-13 16:49:18 -07:00
Max Brunsfeld
c1cf8e02a7 Merge pull request #101 from tree-sitter/merge-more-lex-states
Reduce the number of states in the generated lexer function
2017-09-13 16:46:58 -07:00
Max Brunsfeld
d291af9a31 Refactor error comparisons
* Deal with mergeability outside of error comparison function
* Make `better_version_exists` function pure (don't halt other versions
as a side effect).
* Tweak error comparison logic

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-13 16:38:15 -07:00
Max Brunsfeld
71595ffde6 Only allow one stack link with a given type containing errors 2017-09-13 10:05:31 -07:00
Max Brunsfeld
07fb3ab0e6 Abort recoveries before popping if better versions already exist 2017-09-13 09:56:51 -07:00
Max Brunsfeld
47669e6015 Avoid halting the only non-halted entry in recover 2017-09-12 16:20:06 -07:00
Max Brunsfeld
ee2906ac2e Don't merge stack versions that are halted 2017-09-12 16:19:28 -07:00
Max Brunsfeld
819235bac3 Limit the number of stack nodes that are included in a summary 2017-09-12 12:00:00 -07:00
Max Brunsfeld
99d048e016 Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
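
The recovery strategy described in commit 99d048e016 can be sketched roughly as follows. This is a minimal illustration in Python, not tree-sitter's actual C implementation: the `recover` function, the list-of-`(state, node)` stack, and the `valid_tokens` table are all hypothetical names invented for this sketch, and the real parser works on a graph-structured stack with multiple versions.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical representation: each stack entry pairs a parse state id with
# the node pushed when entering that state (None for the bottom entry).
StackEntry = Tuple[int, Optional[str]]


def recover(stack: List[StackEntry],
            token: str,
            valid_tokens: Dict[int, Set[str]]) -> Optional[List[StackEntry]]:
    """Find the nearest previous state S in which `token` is valid, pop the
    nodes after S, and wrap them in a single error node.

    Returns the new stack, or None if no previous state accepts the token.
    """
    # Start just below the top entry and walk toward the bottom.
    for depth in range(len(stack) - 2, -1, -1):
        state, _ = stack[depth]
        if token in valid_tokens.get(state, set()):
            popped = [node for _, node in stack[depth + 1:] if node is not None]
            error_node = "(ERROR " + " ".join(popped) + ")"
            # Assumption of this sketch: the parser stays in state S after
            # the error node is pushed, so `token` can be handled normally.
            return stack[:depth + 1] + [(state, error_node)]
    return None


# Hypothetical parse table: the tokens each state accepts.
valid_tokens = {0: {"{"}, 1: {"}", "identifier"}, 2: {"+"}, 3: {"identifier"}}
stack = [(0, None), (1, "{"), (2, "x"), (3, "+")]

recovered = recover(stack, "}", valid_tokens)
unrecoverable = recover(stack, "?", valid_tokens)
```

With this data, the unexpected `}` is valid in state 1, so `x` and `+` are popped and wrapped: `recovered` is `[(0, None), (1, "{"), (1, "(ERROR x +)")]`, while `unrecoverable` is `None` because no previous state accepts `?`. The point the commit message makes is visible here: the lexer only ever has to work with the token set of an ordinary state such as state 1.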