Max Brunsfeld
1ca261c79b
Fix some regex parsing bugs
...
* Allow escape sequences to be used in ranges
* Don't give special meaning to dashes outside of character classes
2018-04-06 12:46:06 -07:00
Axel Hecht
345e344377
Tests for issue 158
2018-04-05 14:39:25 +02:00
Max Brunsfeld
dbe77e7199
Simplify testing-only ts_stack_iterate function
2018-03-29 17:50:07 -07:00
Max Brunsfeld
5520983144
Clean up Stack API
...
* Remove StackPopResult
* Rename top_state() -> state()
* Rename top_position() -> position()
* Improve docs
2018-03-29 17:37:54 -07:00
Max Brunsfeld
ee995c3d6b
Avoid redundant retains/releases by giving ts_stack_push move semantics
2018-03-29 17:18:43 -07:00
Max Brunsfeld
186f70649c
Consolidate the logic for detecting conflicting tokens
2018-03-28 10:03:09 -07:00
Max Brunsfeld
a8bc67ac42
Allow LookaheadSet::for_each to terminate early
2018-03-28 10:03:09 -07:00
Max Brunsfeld
43e14332ed
Avoid creating duplicate metadata rules
2018-03-28 10:03:09 -07:00
Max Brunsfeld
b7d0606fbd
Be less conservative in merging parse states with external tokens
...
Also, clean up the internal representation of external tokens
2018-03-16 16:00:40 -07:00
Max Brunsfeld
fe29173d5f
Merge pull request #142 from tree-sitter/fuzz-halt-recover
...
Add 'halt' and 'recover' modes to fuzzer
2018-03-14 09:28:58 -07:00
Phil Turnbull
269b1a0864
Update repo for libFuzzer
...
libFuzzer has now been broken out from LLVM and can be built separately
2018-03-12 13:08:58 -07:00
Max Brunsfeld
7183f8d3e7
Fix unit reduction elimination bugs
...
* Handle 'chains' of unit reductions starting in a single state
* Avoid eliminating rules which will later receive aliases
2018-03-12 07:54:18 -07:00
Max Brunsfeld
df2430b94c
Remove a C error recovery test temporarily
2018-03-08 14:18:12 -08:00
Max Brunsfeld
0810971f3e
🔥 symbol iterator API
...
This idea was never fully baked.
2018-03-08 14:16:37 -08:00
Max Brunsfeld
e927d02f43
Allow reusing leaf nodes unless the next leaf has changes
2018-03-07 17:44:54 -08:00
Max Brunsfeld
52087de4f0
Remove the concept of fragile reductions
...
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
2018-03-02 14:51:54 -08:00
Max Brunsfeld
07fa3eb386
Fix capture of corpus descriptions in integration tests
2018-03-02 11:27:41 -08:00
Max Brunsfeld
a8d539023d
Handle subdirectories existing in parsers' examples folders
2018-03-02 11:04:08 -08:00
Phil Turnbull
bc192d95ca
Build fuzzer in 'halt' and 'recover' modes
...
Build each language fuzzer in two modes (halt_on_error=true and
halt_on_error=false) and use different timeouts for each fuzzer.
Also merge the run-fuzzer and reproduce scripts so they use identical
values of ASAN_OPTIONS/UBSAN_OPTIONS/etc.
2018-03-02 10:13:13 -08:00
Max Brunsfeld
d3ac345644
When parsing corpus, anchor header pattern to line start
2018-03-02 09:46:33 -08:00
Max Brunsfeld
82c7e170b3
Fix case where loop was created in the parse stack
...
Fixes #133
2018-03-02 09:05:20 -08:00
Max Brunsfeld
2daae48fe0
Handle conflicts in repeat rules after external tokens
...
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2018-02-14 11:24:51 -08:00
Max Brunsfeld
facafcd6e4
Pass row/column position to input seek method
2018-02-14 07:31:49 -08:00
Max Brunsfeld
299a146b66
Balance repetition trees after parsing
2018-02-12 11:41:56 -08:00
Max Brunsfeld
8c29841adf
Represent repetitions with associative structure
2018-02-12 11:41:56 -08:00
Max Brunsfeld
46dcd53090
Do not insert missing tokens if halt_on_error option is passed
2018-01-24 14:04:55 -08:00
Max Brunsfeld
b520bdd2d5
Merge pull request #126 from tree-sitter/mb-fix-epsilon-rule-loophole
...
Don't allow an epsilon start rule if it is used in other rules
2018-01-23 17:19:55 -08:00
Max Brunsfeld
2e4f76c164
Don't allow an epsilon start rule if it is used in other rules
2018-01-23 17:05:28 -08:00
Max Brunsfeld
315dff3285
Add an API for getting a node's child index
2018-01-09 14:01:36 -08:00
Max Brunsfeld
f653f2b3bb
Add ts_node_first_{child,named_child}_for_byte methods
2018-01-09 13:44:59 -08:00
Max Brunsfeld
13adfe4927
Update error corpus
2017-12-29 16:21:04 -08:00
Max Brunsfeld
6304a3bcd1
Make it easier to run tests with debug graphs
2017-12-28 12:41:23 -08:00
Max Brunsfeld
addeb6c4c1
Allocate and free trees using an object pool
2017-12-27 10:34:29 -08:00
Max Brunsfeld
0e69da37a5
Return a character count from the lexer's get_column method
2017-12-20 16:26:38 -08:00
Max Brunsfeld
e5851fd9b9
Don't use non-existent \a syntax in test grammars
2017-12-13 12:21:28 -08:00
Max Brunsfeld
f426b61e7c
Fix expectation around preproc directive in C error test
2017-12-13 12:21:13 -08:00
Max Brunsfeld
fbcefe25f7
Avoid creating external tokens that start after they end
2017-12-07 11:50:27 -08:00
Max Brunsfeld
90629bd45a
Add some assertions to the fixture grammar tests
2017-12-07 11:50:27 -08:00
Max Brunsfeld
493db39363
Never move the start rule of a grammar into the lexical grammar
...
This preserves a useful invariant that the root node of the AST is never
a token.
2017-12-07 11:50:27 -08:00
Max Brunsfeld
36c2b685b9
Always invalidate old chunk of text when parsing after an edit
2017-10-04 15:09:46 -07:00
Max Brunsfeld
c0073c5b72
Update error corpus to reflect C grammar changes
2017-10-04 15:06:12 -07:00
Max Brunsfeld
d342b61ede
Re-enable JS fuzzing example test
2017-09-14 11:39:08 -07:00
Max Brunsfeld
9d67a98510
Merge pull request #103 from tree-sitter/python-assertion-failure
...
Assertion failure in parser__advance
2017-09-14 11:38:22 -07:00
Max Brunsfeld
d291af9a31
Refactor error comparisons
...
* Deal with mergeability outside of error comparison function
* Make `better_version_exists` function pure (don't halt other versions
as a side effect).
* Tweak error comparison logic
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-13 16:38:15 -07:00
Phil Turnbull
d9a0fbc210
Add testcase for parser__advance assertion failure
...
The python testcase decodes to:
```
00000000 35 63 6f 6e 88 2c 29 33 2c 2c 2c 2c 63 6f 6e 88 |5con.,)3,,,,con.|
00000010 2c 2a 2c 3a 35 63 6f 6e 2c |,*,:5con,|
```
which triggers:
```
Assertion failed: ((uint32_t)0 < (&reduction.slices)->size), function parser__advance, file src/runtime/parser.c, line 1202.
```
2017-09-13 13:25:31 -04:00
Max Brunsfeld
65ed4281d4
Exclude zeros from speeds reported in benchmarks
2017-09-12 16:30:38 -07:00
Max Brunsfeld
99d048e016
Simplify error recovery; eliminate recovery states
...
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.
This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.
This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.
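The recovery strategy described above can be illustrated with a minimal sketch. This is not the code from this commit: `ToyParseTable`, `ToyStack`, and `toy_recover` are hypothetical names, the parse table is reduced to a plain boolean matrix, and the stack is a flat array rather than the real graph-structured stack. The sketch only shows the search for a previous state S in which the unexpected token T is valid, and the popping of everything above S.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_STATES 4
#define TOY_MAX_TOKENS 3
#define TOY_MAX_DEPTH 16

/* Hypothetical table: valid[state][token] is true when the token is
 * acceptable in that parse state. */
typedef struct {
  bool valid[TOY_MAX_STATES][TOY_MAX_TOKENS];
} ToyParseTable;

/* Hypothetical linear stack of parse states, bottom of the stack first. */
typedef struct {
  int states[TOY_MAX_DEPTH];
  size_t size;
} ToyStack;

/* Walk down from the top of the stack looking for a state S in which
 * `token` is valid. On success, truncate the stack so that S is on top and
 * return the number of popped entries (the ones a real parser would wrap in
 * an error node). Return -1, leaving the stack untouched, if no state on the
 * stack accepts the token. */
int toy_recover(const ToyParseTable *table, ToyStack *stack, int token) {
  assert(stack->size <= TOY_MAX_DEPTH);
  for (size_t i = stack->size; i > 0; i--) {
    int state = stack->states[i - 1];
    if (table->valid[state][token]) {
      int popped = (int)(stack->size - i);
      stack->size = i;
      return popped;
    }
  }
  return -1;
}
```
Because the search only ever lands on an ordinary parse state, the lexer in this model is never asked to distinguish between tokens that no normal state would expect together.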
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
Max Brunsfeld
8b3941764f
Make outstanding_allocation_indices return a vector, not a set
2017-09-07 17:48:44 -07:00
Max Brunsfeld
9d668c5004
Move incompatible token map into LexTableBuilder
2017-08-31 15:46:37 -07:00
Max Brunsfeld
4daf22ba0c
Read files in binary mode in tests
2017-08-09 10:07:03 -07:00