Commit graph

77 commits

Author SHA1 Message Date
Max Brunsfeld
1fece241aa Add ts_parser_set_enabled API 2018-05-21 17:28:12 -07:00
Max Brunsfeld
78f28b14ce Remove unused field 2018-05-18 14:27:52 -07:00
Max Brunsfeld
074c051094 Change the TSInputEdit struct to work with old/new start and end positions 2018-05-17 11:14:51 -07:00
Max Brunsfeld
e3670be42f Avoid one heap allocation when instantiating a TSTreeCursor 2018-05-16 16:05:08 -07:00
Max Brunsfeld
6fc8d9871c Hide the details of TSNode's fields in the public API 2018-05-16 15:44:04 -07:00
Max Brunsfeld
ebddb1a0b5 Add ts_tree_cursor_goto_first_child_for_byte method
Atom needs this for efficiently seeking to the leaf node at a given position,
visiting all of its ancestors along the way.
2018-05-16 13:51:21 -07:00
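The seek described in ebddb1a0b5 can be illustrated with a toy model. This is a minimal sketch in Python, not the library's C code: `Node`, `first_child_for_byte`, `path_to_leaf`, and `example_tree` are hypothetical names, and the rule "pick the first child that extends beyond the goal byte" is an assumption about the intended behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy syntax node covering the byte range [start_byte, end_byte)."""
    kind: str
    start_byte: int
    end_byte: int
    children: List["Node"] = field(default_factory=list)


def first_child_for_byte(node: Node, goal_byte: int) -> Optional[Node]:
    """Return the first child that extends beyond goal_byte, or None."""
    for child in node.children:
        if child.end_byte > goal_byte:
            return child
    return None


def path_to_leaf(root: Node, goal_byte: int) -> List[Node]:
    """Descend from root toward goal_byte, recording every node visited."""
    path = [root]
    node = root
    while True:
        child = first_child_for_byte(node, goal_byte)
        if child is None:
            return path  # no deeper node extends past the goal byte
        path.append(child)
        node = child


def example_tree() -> Node:
    """Hypothetical tree for the 10-byte text 'abc; 1234;'."""
    return Node("program", 0, 10, [
        Node("statement", 0, 4, [Node("identifier", 0, 3), Node(";", 3, 4)]),
        Node("statement", 5, 10, [Node("number", 5, 9), Node(";", 9, 10)]),
    ])
```

Calling `path_to_leaf(example_tree(), 6)` walks from the root to the leaf containing byte 6, so the ancestors are collected in a single descent without parent pointers.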
Max Brunsfeld
32c06b9b59 Make multi-threaded test work on Windows
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-11 17:10:05 -07:00
Max Brunsfeld
043a2fc0d9 Assert absence of memory leaks in randomized multi-threaded tree test 2018-05-11 16:53:47 -07:00
Max Brunsfeld
a3e08e7c31 Add randomized multi-threaded tests on parse trees
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-11 16:10:36 -07:00
Max Brunsfeld
fe53506175 Declare subtrees as const wherever possible
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-11 15:06:13 -07:00
Max Brunsfeld
20c183b7cd Rename ts_subtree_make_* -> ts_subtree_new_* 2018-05-11 13:02:12 -07:00
Max Brunsfeld
bf1bb1604f Rename TSExternalTokenState -> ExternalScannerState 2018-05-11 12:57:41 -07:00
Max Brunsfeld
199a94cc26 Allow the parser to print dot graphs to any file 2018-05-11 12:48:51 -07:00
Max Brunsfeld
e75ecd1bb1 Rework API completely 2018-05-11 10:46:13 -07:00
Max Brunsfeld
35510a612d Rename Tree -> Subtree 2018-05-10 15:11:14 -07:00
Max Brunsfeld
61327b627a Add a unit test asserting that ts_tree_edit doesn't mutate the tree
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-10 12:28:16 -07:00
Max Brunsfeld
09e663c7d1 Make ts_tree_edit return a new tree rather than mutating its argument
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-10 12:23:05 -07:00
Max Brunsfeld
df79ff5997 Refactor ts_tree_edit
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-10 12:04:18 -07:00
Max Brunsfeld
666dfb76d2 Remove document parameter from ts_node_type, ts_node_string
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 16:47:47 -07:00
Max Brunsfeld
92255bbfdd Remove document parameter from ts_node_type, ts_node_string
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 15:28:28 -07:00
Max Brunsfeld
b06747b6ca Remove stale unit tests
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 14:14:42 -07:00
Max Brunsfeld
973e4a44f0 Start work on removing parent pointers
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 12:22:19 -07:00
Max Brunsfeld
d5cfc06fa2 Fix unit test for invalid utf8 at EOF 2018-04-17 17:33:45 -07:00
Max Brunsfeld
09be0b6ef5 Store trees' children in TreeArrays, not w/ separate pointer and length 2018-04-06 13:26:18 -07:00
Max Brunsfeld
a6cf2e87e7 Fix halt_on_error tests 2018-04-06 13:26:18 -07:00
Max Brunsfeld
dbe77e7199 Simplify testing-only ts_stack_iterate function 2018-03-29 17:50:07 -07:00
Max Brunsfeld
5520983144 Clean up Stack API
* Remove StackPopResult
* Rename top_state() -> state()
* Rename top_position() -> position()
* Improve docs
2018-03-29 17:37:54 -07:00
Max Brunsfeld
ee995c3d6b Avoid redundant retains/releases by giving ts_stack_push move semantics 2018-03-29 17:18:43 -07:00
Max Brunsfeld
0810971f3e 🔥 symbol iterator API
This idea was never fully baked.
2018-03-08 14:16:37 -08:00
Max Brunsfeld
e927d02f43 Allow reusing leaf nodes unless the next leaf has changes 2018-03-07 17:44:54 -08:00
Max Brunsfeld
52087de4f0 Remove the concept of fragile reductions
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
2018-03-02 14:51:54 -08:00
Max Brunsfeld
82c7e170b3 Fix case where loop was created in the parse stack
Fixes #133
2018-03-02 09:05:20 -08:00
Max Brunsfeld
46dcd53090 Do not insert missing tokens if halt_on_error option is passed 2018-01-24 14:04:55 -08:00
Max Brunsfeld
315dff3285 Add an API for getting a node's child index 2018-01-09 14:01:36 -08:00
Max Brunsfeld
f653f2b3bb Add ts_node_first_{child,named_child}_for_byte methods 2018-01-09 13:44:59 -08:00
Max Brunsfeld
addeb6c4c1 Allocate and free trees using an object pool 2017-12-27 10:34:29 -08:00
Max Brunsfeld
0e69da37a5 Return a character count from the lexer's get_column method 2017-12-20 16:26:38 -08:00
Max Brunsfeld
36c2b685b9 Always invalidate old chunk of text when parsing after an edit 2017-10-04 15:09:46 -07:00
Max Brunsfeld
d291af9a31 Refactor error comparisons
* Deal with mergeability outside of error comparison function
* Make `better_version_exists` function pure (don't halt other versions
as a side effect).
* Tweak error comparison logic

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-13 16:38:15 -07:00
Max Brunsfeld
99d048e016 Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
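The recovery strategy described in 99d048e016 can be sketched as a toy model. This is a Python illustration under stated assumptions, not the parser's actual code: `recover`, the `(state, node)` stack layout, and the `valid_tokens` table are all hypothetical, and leaving the parser in state S after pushing the error node is a simplification.

```python
from typing import Dict, List, Optional, Set, Tuple

# Each stack entry pairs a parse state with the node pushed on entering it.
Entry = Tuple[int, object]


def recover(
    stack: List[Entry],
    token: str,
    valid_tokens: Dict[int, Set[str]],
) -> Optional[List[Entry]]:
    """Recover from an unexpected token.

    Search the stack from the top down for a state S in which the token
    would be valid, pop every node pushed after S, and wrap the popped
    nodes in a single ERROR node. Returns the new stack, or None when no
    such state exists.
    """
    for i in range(len(stack) - 1, -1, -1):
        state = stack[i][0]
        if token in valid_tokens.get(state, set()):
            popped = [node for _, node in stack[i + 1:]]
            error = ("ERROR", popped)
            # Assumption: the toy parser stays in state S after the error.
            return stack[: i + 1] + [(state, error)]
    return None
```

Because the search only consults states that already occur in the ordinary parse table, this toy version needs no dedicated recovery states, which is the point the commit message makes.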
Max Brunsfeld
f6325746aa Provide symbol metadata with dummy language in stack test 2017-08-08 17:47:24 -07:00
Max Brunsfeld
cc7277fd7d Avoid using IsNull bandit assertion 2017-08-08 12:52:35 -07:00
Max Brunsfeld
94dc703bfc Require that grammars' start rules be visible 2017-08-04 17:07:37 -07:00
Max Brunsfeld
e5c3bf742d Update fixture grammars 2017-08-03 16:32:39 -07:00
Max Brunsfeld
09f4796f6b Get tests passing w/ new alias API 2017-08-01 14:35:34 -07:00
Max Brunsfeld
cb5fe80348 Rename RENAME rule to ALIAS, allow it to create anonymous nodes 2017-07-31 16:41:11 -07:00
Max Brunsfeld
cbdfd89675 Mark reductions as fragile based on their final properties
We previously maintained a set of individual productions that were
involved in conflicts, but that was subtly incorrect because
we don't compare productions themselves when comparing parse items;
we only compare the parse items' properties that could affect the
final reduce actions.
2017-07-21 09:54:24 -07:00
Max Brunsfeld
f33421c53e Fix incorrect node renames in the presence of extra tokens 2017-07-18 21:24:34 -07:00
Max Brunsfeld
10d28d4b56 Merge pull request #92 from tree-sitter/utf16-oob
Add test for UTF16 out-of-bound read
2017-07-18 17:24:31 -07:00
Phil Turnbull
52cec9ed39 Rework SpyInput buffer handling
SpyInput uses a fixed-size buffer and explicitly zeros memory, which is good for
catching logic errors but defeats valgrind's memory tracking. Use a separate
buffer of exactly the correct size for each request. This correctly catches the
problem under valgrind:

```
==8694== Invalid read of size 2
==8694==    at 0x54EFFB: utf16_iterate (utf16.c:10)
==8694==    by 0x551126: ts_lexer__get_lookahead (lexer.c:54)
==8694==    by 0x5515CD: ts_lexer_start (lexer.c:154)
==8694==    by 0x54699F: parser(long,...)(long long) (parser.c:297)
==8694==    by 0x54788A: parser__get_lookahead (parser.c:439)
==8694==    by 0x54B2D3: parser__advance (parser.c:1150)
==8694==    by 0x54C2AA: parser_parse (parser.c:1348)
==8694==    by 0x53F063: ts_document_parse_with_options (document.c:136)
==8694==    by 0x53EF43: ts_document_parse (document.c:107)
==8694==    by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82)
==8694==    by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871)
==8694==    by 0x40F8C5: std::function<void ()>::operator()() const (functional:2267)
==8694==  Address 0x5d08be0 is 0 bytes inside a block of size 1 alloc'd
==8694==    at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8694==    by 0x507C3E: SpyInput::read(void*, unsigned int*) (spy_input.cc:66)
==8694==    by 0x55103D: ts_lexer__get_chunk (lexer.c:29)
==8694==    by 0x5515B6: ts_lexer_start (lexer.c:152)
==8694==    by 0x54699F: parser(long,...)(long long) (parser.c:297)
==8694==    by 0x54788A: parser__get_lookahead (parser.c:439)
==8694==    by 0x54B2D3: parser__advance (parser.c:1150)
==8694==    by 0x54C2AA: parser_parse (parser.c:1348)
==8694==    by 0x53F063: ts_document_parse_with_options (document.c:136)
==8694==    by 0x53EF43: ts_document_parse (document.c:107)
==8694==    by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82)
==8694==    by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871)
```
2017-07-18 12:16:37 -07:00
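The buffer change in 52cec9ed39 can be illustrated by analogy. The sketch below is Python, not the C++ `SpyInput` code, and all three function names are hypothetical. Python's bounds check on an exact-size `bytes` object stands in for valgrind's tracking of an exact-size heap block. A 2-byte read from a 1-byte chunk succeeds silently against a zero-padded fixed-size buffer, but fails against a buffer of exactly the requested size.

```python
import struct


def read_chunk_fixed(text: bytes, offset: int, size: int = 8) -> bytes:
    """Old style: always return a fixed-size, zero-padded buffer."""
    return text[offset:offset + size].ljust(size, b"\0")


def read_chunk_exact(text: bytes, offset: int, size: int = 8) -> bytes:
    """New style: return a buffer holding only the bytes that exist."""
    return text[offset:offset + size]


def decode_utf16_unit(chunk: bytes) -> int:
    """Read one little-endian UTF-16 code unit (2 bytes) from the chunk."""
    return struct.unpack_from("<H", chunk, 0)[0]
```

With the input `b"a"` (a truncated UTF-16 stream), `decode_utf16_unit(read_chunk_fixed(b"a", 0))` returns a value because the padding hides the overrun, while `decode_utf16_unit(read_chunk_exact(b"a", 0))` raises `struct.error`.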