Commit graph

39 commits

Author SHA1 Message Date
Max Brunsfeld
741fb3c5a1 Fix test now that JSON grammar has slightly changed 2018-12-01 21:26:34 -08:00
Max Brunsfeld
10ab7032a6 Fix incorrect node reuse for edits right at EOF 2018-11-11 21:36:31 -08:00
Max Brunsfeld
afeee894dc Fix handling of syntax changes in ranges that were excluded but are now included
Refs atom/atom#18342
2018-11-08 12:16:40 -08:00
Max Brunsfeld
0e3d9c2c58 Handle changes in included ranges when parsing incrementally 2018-11-07 15:10:24 -08:00
Max Brunsfeld
b29d0f622f Cram terminal subtree data into a 64-bit integer when possible 2018-09-17 18:52:34 -07:00
Max Brunsfeld
508499bab1 Fix bug where missing token was inserted outside of any included range 2018-09-11 17:41:23 -07:00
Max Brunsfeld
acc937b7d7 Handle input chunks that end within multi-byte characters 2018-08-02 15:43:30 -07:00
Max Brunsfeld
714fda917a Update test now that JS strings are parsed differently 2018-07-31 11:50:09 -07:00
Max Brunsfeld
87c992a7f0 Add lexer API for detecting boundaries of included ranges
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
2018-07-17 13:58:26 -07:00
Max Brunsfeld
0f0adfb681 Avoid recursion in ts_subtree_edit
This prevents stack overflows when editing very large trees.

Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
2018-07-12 13:53:31 -07:00
Max Brunsfeld
83f88164aa Fix end positions of tokens at the end of included ranges
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
2018-07-09 10:23:25 -07:00
Max Brunsfeld
80cab8fd8a Make the empty chunk 2 bytes long, for UTF16 support 2018-06-25 17:46:23 -07:00
Max Brunsfeld
a6451f9b4f Add ts_parser_set_include_ranges function
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
2018-06-20 13:37:43 -07:00
Max Brunsfeld
d7c1f84d7b Remove resume method, make parse resume by default
Also, add a `reset` method to explicitly discard an outstanding parse.

Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
2018-06-19 15:33:29 -07:00
Max Brunsfeld
b0b3b2e5f3 Consolidate TSInput interface down to one function 2018-06-19 09:34:40 -07:00
Max Brunsfeld
69d8c6f5e6 Check that language is present in both parse() and resume() 2018-05-23 15:41:16 -07:00
Max Brunsfeld
e16f0338d6 Add APIs for pausing a parse after N operations and resuming later 2018-05-23 15:02:39 -07:00
Max Brunsfeld
1fece241aa Add ts_parser_set_enabled API 2018-05-21 17:28:12 -07:00
Max Brunsfeld
199a94cc26 Allow the parser to print dot graphs to any file 2018-05-11 12:48:51 -07:00
Max Brunsfeld
e75ecd1bb1 Rework API completely 2018-05-11 10:46:13 -07:00
Max Brunsfeld
92255bbfdd Remove document parameter from ts_node_type, ts_node_string
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 15:28:28 -07:00
Max Brunsfeld
b06747b6ca Remove stale unit tests
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 14:14:42 -07:00
Max Brunsfeld
d5cfc06fa2 Fix unit test for invalid utf8 at EOF 2018-04-17 17:33:45 -07:00
Max Brunsfeld
e927d02f43 Allow reusing leaf nodes unless the next leaf has changes 2018-03-07 17:44:54 -08:00
Max Brunsfeld
52087de4f0 Remove the concept of fragile reductions
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
2018-03-02 14:51:54 -08:00
Max Brunsfeld
0e69da37a5 Return a character count from the lexer's get_column method 2017-12-20 16:26:38 -08:00
Max Brunsfeld
d291af9a31 Refactor error comparisons
* Deal with mergeability outside of error comparison function
* Make `better_version_exists` function pure (don't halt other versions
as a side effect).
* Tweak error comparison logic

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-13 16:38:15 -07:00
Max Brunsfeld
99d048e016 Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
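
The recovery strategy described in the message for 99d048e016 above can be illustrated with a toy sketch. This is not Tree-sitter's implementation: the `recover` function, the `VALID_TOKENS` table, the state numbers, and the choice to leave the error node on the stack in state S are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple

# Each stack entry pairs a parse state with the node that was pushed to reach it.
# The bottom entry holds the start state and no node.
Entry = Tuple[int, Optional[object]]


def recover(
    stack: List[Entry],
    valid_tokens: Dict[int, Set[str]],
    token: str,
) -> Optional[List[Entry]]:
    """Return a new stack in which `token` is valid, or None if no state accepts it."""
    # Walk from the top of the stack downward to find the nearest state S
    # in which the unexpected token would be valid.
    for depth in range(len(stack) - 1, -1, -1):
        state, _ = stack[depth]
        if token in valid_tokens.get(state, set()):
            # Pop every node pushed after S.
            popped = [node for _, node in stack[depth + 1:]]
            if not popped:
                # The token is already valid in the current state.
                return list(stack)
            # Wrap the popped nodes in a single error node.
            error_node = ("ERROR", popped)
            # Assumption of this sketch: the error node stays on the stack
            # without changing the state, so parsing resumes in S.
            return stack[: depth + 1] + [(state, error_node)]
    return None


# Hypothetical table: which tokens each toy state accepts.
VALID_TOKENS = {
    0: {"{"},
    1: {"identifier", "}"},
    2: {":"},
    3: {"string"},
}

# The parser has seen `{ identifier :` and now meets an unexpected `}`.
stack = [(0, None), (1, "{"), (2, "identifier"), (3, ":")]
recovered = recover(stack, VALID_TOKENS, "}")
print(recovered)
```

Because the search only ever lands on ordinary parse states, the lexer in this model is always asked for a token set belonging to a normal state, which is the property the commit message highlights.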
Max Brunsfeld
94dc703bfc Require that grammars' start rules be visible 2017-08-04 17:07:37 -07:00
Max Brunsfeld
e5c3bf742d Update fixture grammars 2017-08-03 16:32:39 -07:00
Max Brunsfeld
cbdfd89675 Mark reductions as fragile based on their final properties
We previously maintained a set of individual productions that were
involved in conflicts, but that was subtly incorrect because
we don't compare productions themselves when comparing parse items;
we only compare the parse items' properties that could affect the
final reduce actions.
2017-07-21 09:54:24 -07:00
Max Brunsfeld
8f028ebf68 Avoid deep tree comparison when both trees have errors 2017-07-05 17:33:35 -07:00
Max Brunsfeld
f62ee5a0f3 Fix OOB reads at ends of chunks
Signed-off-by: Philip Turnbull <philipturnbull@github.com>
2017-06-23 12:09:16 -07:00
Max Brunsfeld
a15e974150 Make clearer assertions about SpyInput's read strings 2017-03-21 12:14:04 -07:00
Max Brunsfeld
ca943f09a4 Update expected trees in error recovery test 2017-03-21 11:41:01 -07:00
Max Brunsfeld
f032da198e Finish test for invalid UTF8 handling
Signed-off-by: Tim Clem <timothy.clem@gmail.com>
2017-03-21 11:05:32 -07:00
Timothy Clem
7092d4522a Test demonstrating non-UTF8 input failure 2017-03-21 09:58:35 -07:00
Max Brunsfeld
d222dbb9fd Allow lexer to accept tokens that ended at previous positions
* Track lookahead in each tree
* Add 'mark_end' API that external scanners can use
2017-03-13 17:06:52 -07:00
Max Brunsfeld
6dc0ff359d Rename spec -> test
'Test' is a lot more straightforward of a name.
2017-03-09 20:40:01 -08:00
Renamed from spec/runtime/parser_spec.cc