Max Brunsfeld
93d7a75b09
Suppress one unnecessary type of error recovery variation
...
If we already have a stack version in which, for example,
a `function_call` is skipped, don't create another stack
version in which that `function_call` is reduced to an
`expression`, and then the `expression` is skipped. That
doesn't improve the error recovery at all, but adds to the
branching factor of the parse stack and makes things harder
to debug.
2017-02-07 22:07:56 -08:00
Max Brunsfeld
5b23a8fca9
Update error corpus to reflect slightly different recoveries
2017-02-07 17:49:15 -08:00
Max Brunsfeld
343887c1dd
Fix miscounting of extra tokens when repairing errors
2017-02-06 17:43:07 -08:00
Max Brunsfeld
60f6998485
Rename generated language functions to e.g. tree_sitter_python
...
They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
2017-01-31 10:29:31 -08:00
Max Brunsfeld
0a286d41f3
Add python error recovery tests
2017-01-08 22:06:36 -08:00
Max Brunsfeld
2b3da512a4
Add serialize, deserialize and reset callbacks to external scanners
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-12-20 13:12:01 -08:00
Max Brunsfeld
a1770ce844
Allow external tokens to be used as extras
2016-12-12 22:06:01 -08:00
Max Brunsfeld
10b51a05a1
Allow external scanners to refer to (and return) internally-defined tokens
...
Tokens that are defined in the grammar's rules may now be included in the
externals list also, so that external scanners can check if they are valid
lookaheads or not, and if so, can return them to the parser if needed.
2016-12-09 13:32:58 -08:00
Max Brunsfeld
c4fe8ded95
Remove state argument to Lexer advance method
2016-12-05 16:36:34 -08:00
Max Brunsfeld
0f8e130687
Call external scanner functions when lexing
2016-12-02 22:03:48 -08:00
Max Brunsfeld
c966af0412
Start work on external tokens
2016-12-02 16:24:19 -08:00
Max Brunsfeld
d627042fa6
Fix javascript error test
...
A single line with two function declarations now parses
successfully, so to create the desired error recovery
scenario, wrap the two functions in an assignment.
2016-11-30 23:19:34 -08:00
Max Brunsfeld
fad7294ba4
Store shift states for non-terminals directly in the main parse table
2016-11-14 08:36:06 -08:00
Max Brunsfeld
b76574e01c
Handle ambiguities between extra and non-extra tokens using normal GLR splitting
2016-09-06 10:22:16 -07:00
Max Brunsfeld
c1b6d9f5be
Improve error comparison criteria
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-09-01 11:39:23 -07:00
Max Brunsfeld
0faae52132
Fix some inconsistencies in error cost calculation
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-31 10:51:59 -07:00
Max Brunsfeld
1d617ab5e0
Allow reductions based on error token, skipping some preceding content
2016-08-29 17:34:51 -07:00
Max Brunsfeld
31d1160e21
Base error costs on top-level trees skipped and lines of text skipped
...
Rather than on the total number of tokens skipped
2016-08-29 17:06:23 -07:00
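As a rough illustration of the cost model described in 31d1160e21, the sketch below scores a recovery by the number of top-level trees and lines of text it skips, not by token count. The struct, the function name, and both weights are hypothetical; the commit message does not state the actual values or formula.

```c
/* Hypothetical weights: the commit message does not give the real values. */
enum {
  COST_PER_SKIPPED_TREE = 10,
  COST_PER_SKIPPED_LINE = 3
};

/* Summary of what one recovery (one stack version) had to skip. */
typedef struct {
  unsigned skipped_tree_count; /* top-level trees skipped */
  unsigned skipped_line_count; /* lines of text skipped */
} RecoverySummary;

/* Lower cost means a preferable recovery. The number of tokens inside
   each skipped tree deliberately plays no role in the result. */
unsigned recovery_cost(RecoverySummary summary) {
  return summary.skipped_tree_count * COST_PER_SKIPPED_TREE +
         summary.skipped_line_count * COST_PER_SKIPPED_LINE;
}
```

Under these assumed weights, skipping one large tree on a single line costs less than skipping three small trees, regardless of how many tokens each tree contains.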
Max Brunsfeld
e947d7e2ad
Adjust test assertions for subtly different recoveries
2016-08-29 11:23:52 -07:00
Max Brunsfeld
1b8843dd41
Perform all possible reductions recursively upon detecting an error
2016-08-29 11:23:35 -07:00
Max Brunsfeld
9538b5b879
Don't count extra trees toward stack versions' error costs
2016-06-26 22:46:40 -07:00
Max Brunsfeld
9972709e43
Allow error recovery to skip non-terminal nodes after error detection
2016-06-24 10:28:05 -07:00
Max Brunsfeld
94721c7ec0
Rewind and re-tokenize in error mode after detecting an error
2016-06-17 21:26:03 -07:00
Max Brunsfeld
e70547cd11
Allow recoveries that skip leading children of invisible trees
...
Before this, errors could only be recovered by skipping internal children.
2016-06-14 14:48:35 -07:00
Max Brunsfeld
9b67b21dcd
Fix an outdated error corpus entry
2016-06-02 14:04:10 -07:00
Max Brunsfeld
e1a3a1daeb
Import error corpus entries from grammar repos
...
Now that error recovery requires no input from the grammar author, it shouldn't
be tested in the individual grammar repos.
2016-05-28 20:12:02 -07:00
Max Brunsfeld
0f7dbea9a3
Unify test targets, use externally defined languages as fixtures
2016-01-15 11:19:24 -08:00
Max Brunsfeld
ad4089a4bf
Move anonymous tokens grammar into integration spec
2016-01-14 10:35:03 -08:00
Max Brunsfeld
49f393b75e
Merge pull request #22 from maxbrunsfeld/c-compiler-api
...
Simplify the compiler API
2016-01-13 21:08:41 -08:00
Max Brunsfeld
d4632ab9a9
Make the compile function plain C and take a JSON grammar
2016-01-11 12:33:48 -08:00
Max Brunsfeld
36870bfced
Make Grammar a simple struct
2016-01-08 15:51:30 -08:00
Max Brunsfeld
e59f6294cb
Fix bug in lexical state de-duping
2015-12-30 11:15:36 -08:00
Max Brunsfeld
4b04afac5e
Control lexer's error-mode via explicit boolean argument
...
Previously, the lexer would operate in error-mode (ignoring any garbage input
until it found a valid token) if it was invoked in the 'error' state. Now that
the error state is deduped with other lexical states, the lexer might be invoked
in that state even when error-mode is not intended. This adds a third argument
to `ts_lex` that explicitly sets the error-mode.
This bug was unlikely to occur in any real grammars, but it caused the
node-tree-sitter-compiler test suite to fail for some grammars with only one
rule.
2015-12-30 09:43:12 -08:00
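The idea behind 4b04afac5e, passing error-mode as an explicit flag instead of inferring it from the lex state, can be sketched with a toy lexer. This is not the real `ts_lex`; `toy_lex`, `ToyToken`, and the digits-only token rule are assumptions made purely for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Result of one lexing attempt. */
typedef struct {
  size_t start;  /* offset of the token within the input */
  size_t length; /* number of characters in the token */
  bool found;    /* false if no token was recognized */
} ToyToken;

/* Toy lexer that only recognizes runs of decimal digits.
   When error_mode is true, it skips garbage until a digit appears;
   when false, the token must begin at the first character. */
ToyToken toy_lex(const char *input, bool error_mode) {
  ToyToken token = {0, 0, false};
  size_t i = 0;

  if (error_mode) {
    while (input[i] != '\0' && !isdigit((unsigned char)input[i])) i++;
  }

  token.start = i;
  while (isdigit((unsigned char)input[i])) i++;
  token.length = i - token.start;
  token.found = token.length > 0;
  return token;
}
```

Because the caller chooses the mode, the same lex state can be shared between normal lexing and error recovery without the lexer guessing which one is intended.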
Max Brunsfeld
939476c947
When removing duplicate lex states, update the error state too
...
Now, instead of being stored as a separate field on the parse table, the error
state is just the first state in the states vector.
2015-12-29 21:02:24 -08:00
Max Brunsfeld
97a281502e
Store parse table more compactly
2015-12-29 11:27:41 -08:00
Max Brunsfeld
386b124866
Ensure that there are no duplicate lex states
2015-12-20 15:46:13 -08:00
Max Brunsfeld
c9db5499e9
Remove uninteresting corpus entries
2015-12-18 13:46:24 -08:00
Max Brunsfeld
66460b24fd
Use more greek letters in arithmetic corpus
2015-12-18 13:46:10 -08:00
Max Brunsfeld
1c6ad5f7e4
Rename ubiquitous_tokens -> extra_tokens in compiler API
...
They were already called this in the runtime code.
'Extra' is just easier to say.
2015-12-17 15:50:50 -08:00
Max Brunsfeld
c495076adb
Record in parse table which actions can hide splits
...
Suppose a parse state S has multiple actions for a terminal lookahead symbol A.
Then during incremental parsing, while in state S, the parser should not
reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B
might prematurely discard one of the possible actions that a batch parser
would have attempted in state S, upon seeing A as a lookahead.
2015-12-17 13:11:56 -08:00
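The reuse rule stated in c495076adb amounts to a set-disjointness test. The sketch below assumes terminals are numbered 0 to 31 and that FIRST sets are stored as bitmasks; `SymbolSet` and `can_reuse_nonterminal` are hypothetical names, and the actual parse-table encoding is not described in the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit i is set when terminal i belongs to the set (assumed encoding). */
typedef uint32_t SymbolSet;

/* ambiguous_terminals: terminals with multiple actions in state S.
   first_of_b: FIRST(B) for the non-terminal B being considered for reuse.
   B may be reused only if FIRST(B) contains none of those terminals,
   since reusing it otherwise could hide a split that a batch parser
   would have performed in state S. */
bool can_reuse_nonterminal(SymbolSet ambiguous_terminals, SymbolSet first_of_b) {
  return (ambiguous_terminals & first_of_b) == 0;
}
```

For example, if terminal A is bit 0 and state S has multiple actions for A, any B whose FIRST set has bit 0 set is rejected, while a B that can only begin with other terminals remains reusable.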
Max Brunsfeld
66144dc28e
Treat tokens that are sometimes extra as fragile
2015-12-16 20:04:45 -08:00
Max Brunsfeld
9bff4d0b06
Add concise method syntax to javascript fixture grammar
...
This exposes an ambiguity-handling bug that I discovered while adding ES6 support to
tree-sitter-javascript.
2015-12-15 22:25:48 -08:00
Max Brunsfeld
d713054d61
Record which tokens are fragile when lexing
2015-12-10 21:05:54 -08:00
Max Brunsfeld
75f31a79a3
Treat reduce actions with different production IDs as distinct
2015-12-10 13:00:26 -08:00
Max Brunsfeld
76e4599d5e
For now, allow any expression as an assignment LHS
2015-12-06 14:14:17 -08:00
Max Brunsfeld
863cabc827
Don't include trailing ubiquitous tokens as children when reducing
2015-12-02 15:31:15 -08:00
Max Brunsfeld
64e56f5acc
Add assignments to C grammar
...
This creates another source of ambiguity: assignments vs. initializations
for declarations. This is good for testing ambiguity handling.
2015-12-02 15:10:24 -08:00
Max Brunsfeld
ad619d95f6
Add 'extra' field to symbol metadata
...
This stores whether a symbol is only ever used as a ubiquitous token. This will
allow ubiquitous nodes to be reused more effectively: if they are always
ubiquitous, then they can be reused immediately, and otherwise, they must be
broken down in case they need to be used structurally.
2015-12-02 15:10:24 -08:00
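The reuse decision described in ad619d95f6 can be sketched as a one-bit flag in a per-symbol metadata bitfield. The struct layout, the field names other than `extra`, and the helper function are assumptions for illustration, not the library's actual definitions.

```c
#include <stdbool.h>

/* Per-symbol metadata packed into single-bit fields (assumed layout). */
typedef struct {
  bool visible : 1; /* hypothetical: symbol appears in the syntax tree */
  bool named : 1;   /* hypothetical: symbol has a rule name */
  bool extra : 1;   /* symbol is only ever used as a ubiquitous token */
} ToySymbolMetadata;

/* A ubiquitous node can be reused as-is only if its symbol is always
   ubiquitous. Otherwise it must be broken down first, because it may
   need to be used structurally. */
bool can_reuse_ubiquitous_node_immediately(ToySymbolMetadata metadata) {
  return metadata.extra;
}
```

A comment-like symbol that never appears in a grammar rule would have `extra` set and be reused directly, whereas a newline token that is sometimes structural would not.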
Max Brunsfeld
f08554e958
Replace NodeType enum with SymbolMetadata bitfield
...
This will allow storing other metadata about symbols, such as whether they
only appear as ubiquitous tokens.
2015-12-02 15:10:24 -08:00
Max Brunsfeld
40a90b551a
Allow error recovery to look all the way to the bottom of the stack
...
Previously, there was a bug where the first node on the stack
would never be popped.
2015-11-11 16:59:41 -08:00