1024 commits

Author SHA1 Message Date
Max Brunsfeld
b3edd8f749 Remove use of shared_ptr in choice, repeat, and seq factories 2017-03-17 14:28:13 -07:00
Max Brunsfeld
d9fb863bea Fix build errors w/ gcc 2017-03-17 14:03:49 -07:00
Max Brunsfeld
416cbb9def Add missing cassert includes 2017-03-17 13:54:40 -07:00
Max Brunsfeld
90d21adf3b Format make_visitor helper consistently w/ project 2017-03-17 13:37:26 -07:00
Max Brunsfeld
db4b9ebc7c Implement Rule as a union rather than an abstract base class 2017-03-17 13:29:31 -07:00
Max Brunsfeld
d222dbb9fd Allow lexer to accept tokens that ended at previous positions
* Track lookahead in each tree
* Add 'mark_end' API that external scanners can use
2017-03-13 17:06:52 -07:00
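The `mark_end` idea in the commit above can be illustrated with a small sketch. It assumes a hypothetical `MockLexer` type and a hypothetical `scan_number` function; neither is tree-sitter's actual lexer struct or API. The scanner marks the end of a token, keeps reading ahead, and only extends the token if the extra input turns out to belong to it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a lexer; NOT tree-sitter's real lexer struct.
struct MockLexer {
  std::string input;
  std::size_t position = 0;   // how far the scanner has read
  std::size_t token_end = 0;  // end of the token accepted so far

  explicit MockLexer(std::string text) : input(std::move(text)) {}

  // Next unread character, or '\0' at end of input.
  char lookahead() const {
    return position < input.size() ? input[position] : '\0';
  }
  void advance() {
    if (position < input.size()) position++;
  }
  // Record the current position as the end of the token.
  void mark_end() { token_end = position; }
};

// Hypothetical scanner: reads digits, then peeks past a '.' to decide
// whether a fractional part follows. Returns the token's end offset
// (0 if no number starts at the beginning of the input).
std::size_t scan_number(MockLexer &lexer) {
  auto is_digit = [](char c) { return c >= '0' && c <= '9'; };
  if (!is_digit(lexer.lookahead())) return 0;

  while (is_digit(lexer.lookahead())) lexer.advance();
  lexer.mark_end();  // the integer part is a complete token on its own

  if (lexer.lookahead() == '.') {
    lexer.advance();  // read past the '.', beyond the marked end
    if (is_digit(lexer.lookahead())) {
      while (is_digit(lexer.lookahead())) lexer.advance();
      lexer.mark_end();  // extend the token to cover the fraction
    }
  }
  return lexer.token_end;
}
```

For the input `12.x` the scanner reads three characters but the token still ends after `12`, which is the "token that ended at a previous position" case named in the commit subject.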
Max Brunsfeld
f04d7c5860 Handle unused tokens 2017-03-09 21:16:37 -08:00
Max Brunsfeld
c79fae6d21 Clean up extract_tokens function 2017-03-09 21:16:20 -08:00
Max Brunsfeld
f049d5d94c Make ParseItem a struct, not a class 2017-03-08 21:06:30 -08:00
Max Brunsfeld
64e9230071 Use LexTableBuilder to detect conflicts between tokens more correctly 2017-03-08 12:47:38 -08:00
Max Brunsfeld
abf8a4f2c2 🎨 2017-03-01 22:15:26 -08:00
Max Brunsfeld
686dc0997c Avoid introducing certain lexical conflicts during parse state merging
The current, fairly conservative approach is to avoid merging parse states which
would cause a pair of tokens to co-exist for the first time in any parse state,
where the two tokens can start with the same character and at least one of the
tokens can contain a character which is part of the grammar's separators.
2017-02-27 22:54:38 -08:00
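The merging rule described in the commit above can be sketched as a predicate. This is only an illustration of the stated condition, not the project's implementation: `TokenInfo`, `can_merge`, and the set of already co-existing pairs are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using TokenId = std::size_t;
using TokenPair = std::pair<TokenId, TokenId>;

// Hypothetical per-token summary; field names are assumptions.
struct TokenInfo {
  std::set<char> first_chars;  // characters the token can start with
  bool may_contain_separator;  // can it contain a separator character?
};

// Normalize a pair so that (a, b) and (b, a) compare equal.
TokenPair ordered(TokenId a, TokenId b) {
  return a < b ? TokenPair(a, b) : TokenPair(b, a);
}

bool share_first_char(const TokenInfo &a, const TokenInfo &b) {
  for (char c : a.first_chars) {
    if (b.first_chars.count(c)) return true;
  }
  return false;
}

// Two states may be merged unless doing so would make some pair of
// tokens co-exist for the first time, where the pair shares a starting
// character and at least one token can contain a separator.
bool can_merge(const std::set<TokenId> &left,
               const std::set<TokenId> &right,
               const std::vector<TokenInfo> &tokens,
               const std::set<TokenPair> &coexisting_pairs) {
  for (TokenId a : left) {
    for (TokenId b : right) {
      if (a == b) continue;
      if (coexisting_pairs.count(ordered(a, b))) continue;  // not new
      const TokenInfo &ta = tokens[a];
      const TokenInfo &tb = tokens[b];
      if (share_first_char(ta, tb) &&
          (ta.may_contain_separator || tb.may_contain_separator)) {
        return false;
      }
    }
  }
  return true;
}
```

The check is conservative in the same sense as the commit message: it rejects a merge as soon as one newly co-existing pair looks risky, without proving that a real lexical conflict exists.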
Max Brunsfeld
3c8e6f9987 Restructure parse state merging logic
* Remove remnants of templatized remove_duplicate_states function
* Rename recovery_tokens function to get_compatible_tokens and augment it to
  also compute pairs of tokens which could potentially be incompatible
2017-02-26 12:23:48 -08:00
Max Brunsfeld
df520635c6 Prevent crash due to huge number of possible paths through parse stack 2017-02-20 14:34:10 -08:00
Max Brunsfeld
cefc57fe86 Move error cost comparisons into their own source file 2017-02-19 21:54:06 -08:00
Max Brunsfeld
5b4e6df3ff Don't mark error nodes created in the error state as extras 2017-02-19 21:54:06 -08:00
Max Brunsfeld
c14a776a3d Avoid including trailing extra tokens within error nodes unnecessarily 2017-02-19 21:21:54 -08:00
Max Brunsfeld
135d8ef4e0 Merge pull request #58 from tree-sitter/reduce-error-recovery-branching
Reduce the branching factor of the parse stack during error recovery
2017-02-18 11:34:09 -08:00
Rob Rix
638aa87e42 Pass through to ts_string_input_make_with_length. 2017-02-10 09:27:21 -05:00
Rob Rix
eab518e5da Semicolon shame. 2017-02-10 09:20:58 -05:00
Rob Rix
c230658bae Add public API to set the input string with explicit length. 2017-02-10 09:10:31 -05:00
Rob Rix
e6927238e1 Construct TSStringInput with explicit length. 2017-02-10 09:10:06 -05:00
Max Brunsfeld
93d7a75b09 Suppress one unnecessary type of error recovery variation
If we already have a stack version in which, for example,
a `function_call` is skipped, don't create another stack
version in which that `function_call` is reduced to an
`expression`, and then the `expression` is skipped. That
doesn't improve the error recovery at all, but adds to the
branching factor of the parse stack and makes things harder
to debug.
2017-02-07 22:07:56 -08:00
Max Brunsfeld
819b63e78d Merge pull request #57 from tree-sitter/fix-error-recovery-bugs
Fix error recovery bug when error parent node contains extra tokens
2017-02-07 21:11:16 -08:00
Max Brunsfeld
b01c5404eb Ensure error_end_position variable is initialized 2017-02-07 17:48:53 -08:00
Max Brunsfeld
343887c1dd Fix miscounting of extra tokens when repairing errors 2017-02-06 17:43:07 -08:00
Timothy Clem
ab00f1b0da Add support for \W and \D negated character classes too 2017-01-31 15:03:48 -08:00
Timothy Clem
902b7f9745 Allow \S for negated whitespace regex shorthand 2017-01-31 14:45:28 -08:00
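The negated shorthands `\S`, `\W`, and `\D` from the two commits above can be read as the complements of `\s`, `\w`, and `\d`. The sketch below shows that relationship under stated assumptions: the class definitions are ASCII-only guesses (for example, `\s` is taken to be space, tab, newline, and carriage return), and `matches_shorthand` is a hypothetical helper, not a function from the project.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>

// Membership test for the positive classes. The exact character sets
// are assumptions made for this sketch (ASCII only).
bool in_positive_class(char cls, char c) {
  switch (cls) {
    case 'd':
      return c >= '0' && c <= '9';
    case 'w':
      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
             (c >= '0' && c <= '9') || c == '_';
    case 's':
      return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    default:
      throw std::invalid_argument("unknown shorthand class");
  }
}

// An uppercase class letter (D, W, S) negates the lowercase class.
bool matches_shorthand(char cls, char c) {
  bool negated = std::isupper(static_cast<unsigned char>(cls)) != 0;
  char positive =
      static_cast<char>(std::tolower(static_cast<unsigned char>(cls)));
  bool result = in_positive_class(positive, c);
  return negated ? !result : result;
}
```

Treating the uppercase form as a negation flag keeps a single definition per class, so the positive and negated variants cannot drift apart.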
Max Brunsfeld
0a6e5f9ee6 Fix some build warnings on gcc 2017-01-31 11:46:28 -08:00
Max Brunsfeld
4131e1c16e Return an error when external token name matches non-terminal rule 2017-01-31 11:36:51 -08:00
Max Brunsfeld
60f6998485 Rename generated language functions to e.g. tree_sitter_python
They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
2017-01-31 10:29:31 -08:00
Max Brunsfeld
d853b6504d Add version number to TSLanguage structs 2017-01-31 10:21:47 -08:00
Max Brunsfeld
672d491775 Fix errors in management of external scanner's most recent state 2017-01-30 22:04:46 -08:00
Max Brunsfeld
dc6598e07e Include external token states in stack debug graphs 2017-01-30 21:58:27 -08:00
Max Brunsfeld
896254eea5 Fix error in changed ranges calculation
There was an error in the way that we calculate the reference
scope sequences that are used as the basis for assertions about
changed ranges in randomized tests. The error caused some
characters' scopes to not be checked. This corrects the reference
implementation and fixes a previously uncaught bug in the
implementation of `tree_path_get_changed_ranges`.

Previously, when iterating over the old and new trees, we would
only perform comparisons of visible nodes. This resulted in a failure
to do any comparison for portions of the text in which there were
trailing invisible child nodes (e.g. trailing `_line_break` nodes
inside `statement` nodes in the JavaScript grammar).

Now, we additionally perform comparisons at invisible leaf nodes,
based on their lowest visible ancestor.
2017-01-27 23:47:34 -08:00
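The fix described above, comparing invisible leaf nodes by their lowest visible ancestor, can be sketched as follows. The `Node` and `ScopedRange` types and the `collect_leaf_scopes` function are hypothetical and much simpler than the real tree structures; the sketch only shows how each leaf, visible or not, ends up with a scope that can be compared between an old and a new tree.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified syntax tree node.
struct Node {
  std::string name;
  bool visible;
  std::size_t start;
  std::size_t end;
  std::vector<Node> children;
};

// A text range together with the scope used to compare it.
struct ScopedRange {
  std::size_t start;
  std::size_t end;
  std::string scope;
};

// Walk the tree and emit one entry per leaf. A visible leaf is its own
// scope; an invisible leaf takes the name of its lowest visible
// ancestor, passed down as `enclosing`.
void collect_leaf_scopes(const Node &node, const std::string &enclosing,
                         std::vector<ScopedRange> &out) {
  const std::string &scope = node.visible ? node.name : enclosing;
  if (node.children.empty()) {
    out.push_back({node.start, node.end, scope});
    return;
  }
  for (const Node &child : node.children) {
    collect_leaf_scopes(child, scope, out);
  }
}
```

With a `statement` node that contains a visible `identifier` followed by an invisible `_line_break`, the trailing range is reported with scope `statement` instead of being skipped, so that portion of the text still takes part in the comparison.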
Max Brunsfeld
36608180d2 Store external token states in the parse stack 2017-01-08 22:06:05 -08:00
Max Brunsfeld
3a4daace26 Move reusable node functions to their own file 2017-01-05 10:07:27 -08:00
Max Brunsfeld
12cd2132ff Add test for retrieving last external token state in a Tree 2017-01-04 21:23:04 -08:00
Max Brunsfeld
d57043b665 Add ability to store external token state per stack version 2017-01-04 21:22:23 -08:00
Max Brunsfeld
2fa7b453c8 Restore external scanner's state only after repositioning lexer
Also, properly identify the leaf node with the external token state
2016-12-21 13:59:56 -08:00
Max Brunsfeld
3706678b89 Pass const TSExternalTokenState to external scanner deserialize hook 2016-12-21 13:58:18 -08:00
Max Brunsfeld
4136dad5de Avoid referencing invalid union member in tree_path_descend 2016-12-21 13:21:21 -08:00
Max Brunsfeld
1595a02692 Avoid referencing invalid union member in tree_set_children 2016-12-21 12:23:24 -08:00
Max Brunsfeld
34a65f588d Tweak naming and organization of external-scanner related language fields 2016-12-21 11:24:41 -08:00
Max Brunsfeld
42c41c158c Refactor logic for handling shared internal/external tokens 2016-12-21 10:49:55 -08:00
Max Brunsfeld
e6c82ead2c Start work toward maintaining external scanner's state during incremental parses 2016-12-20 17:06:20 -08:00
Max Brunsfeld
2b3da512a4 Add serialize, deserialize and reset callbacks to external scanners
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-12-20 13:12:01 -08:00
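A rough picture of what serialize, deserialize, and reset callbacks are for, assuming a hypothetical `IndentScanner` whose method signatures are invented for this sketch and do not match the project's actual callback signatures: the scanner's private state is copied out at some point, and can later be restored or cleared.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical external scanner that tracks open indentation levels.
struct IndentScanner {
  std::vector<std::uint8_t> indent_stack;

  // Forget all state, e.g. before parsing from the start of a document.
  void reset() { indent_stack.clear(); }

  // Copy the state out so it can be stored alongside a token.
  std::vector<std::uint8_t> serialize() const { return indent_stack; }

  // Restore a previously stored state, e.g. when resuming at that token.
  void deserialize(const std::vector<std::uint8_t> &state) {
    indent_stack = state;
  }
};
```

A caller could save the result of `serialize()` at one token, let the scanner continue, and later call `deserialize()` with the saved bytes to resume from that token without rescanning everything before it.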
Max Brunsfeld
a1770ce844 Allow external tokens to be used as extras 2016-12-12 22:06:01 -08:00
Max Brunsfeld
0e595346be Make lexer log output easier to read 2016-12-09 13:33:37 -08:00
Max Brunsfeld
10b51a05a1 Allow external scanners to refer to (and return) internally-defined tokens
Tokens that are defined in the grammar's rules may now be included in the
externals list also, so that external scanners can check if they are valid
lookaheads or not, and if so, can return them to the parser if needed.
2016-12-09 13:32:58 -08:00
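The behaviour described above, where an external scanner checks whether a token is a valid lookahead before returning it, can be sketched like this. The token names, the `scan` signature, and the `NO_TOKEN` constant are all hypothetical; `SEMICOLON` stands for a token that is assumed to be defined in the grammar's rules and also listed in the externals.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical token list shared between grammar and scanner.
enum Token { STRING_CONTENT, SEMICOLON, TOKEN_COUNT };

constexpr int NO_TOKEN = -1;

// Returns the token recognized at the start of `input`, considering
// only tokens that the parser reports as valid lookaheads.
int scan(const std::string &input,
         const std::array<bool, TOKEN_COUNT> &valid) {
  if (input.empty()) return NO_TOKEN;
  // Internally-defined token: only return it where the parser allows it.
  if (valid[SEMICOLON] && input[0] == ';') return SEMICOLON;
  // Otherwise fall back to the scanner's own token, if that is allowed.
  if (valid[STRING_CONTENT]) return STRING_CONTENT;
  return NO_TOKEN;
}
```

The same `;` character yields different tokens depending on the validity array, which is the kind of context-dependent decision the externals list makes possible.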