Timothy Clem
ab00f1b0da
Add support for \W and \D negated character classes too
2017-01-31 15:03:48 -08:00
Timothy Clem
902b7f9745
Allow \S for negated whitespace regex shorthand
2017-01-31 14:45:28 -08:00
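A minimal sketch of what the negated shorthands in the two commits above (`\S`, `\W`, `\D`) mean, assuming the conventional regex definitions: `\d` is an ASCII digit, `\w` is an ASCII letter, digit or underscore, and `\s` is whitespace as reported by `isspace` in the default C locale. The function names `is_word_char` and `shorthand_matches` are hypothetical and are not taken from the tree-sitter source, whose exact character sets may differ.

```c
#include <ctype.h>
#include <stdbool.h>

/* Assumed definition of \w: ASCII letters, digits and underscore. */
static bool is_word_char(unsigned char c) {
  return isalnum(c) || c == '_';
}

/* Returns true if `c` matches the class named by the letter after the
 * backslash ('d', 'w', 's', or the negated 'D', 'W', 'S').
 * Hypothetical helper for illustration only. */
static bool shorthand_matches(char shorthand, unsigned char c) {
  switch (shorthand) {
    case 'd': return isdigit(c) != 0;
    case 'w': return is_word_char(c);
    case 's': return isspace(c) != 0;
    /* Each uppercase form is the complement of its lowercase form. */
    case 'D': return !shorthand_matches('d', c);
    case 'W': return !shorthand_matches('w', c);
    case 'S': return !shorthand_matches('s', c);
    default:  return false; /* unknown shorthand: match nothing */
  }
}
```

Defining each negated class as the complement of its positive counterpart keeps the two from drifting apart if the positive definition changes.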
Max Brunsfeld
0a6e5f9ee6
Fix some build warnings on gcc
2017-01-31 11:46:28 -08:00
Max Brunsfeld
4131e1c16e
Return an error when external token name matches non-terminal rule
2017-01-31 11:36:51 -08:00
Max Brunsfeld
60f6998485
Rename generated language functions to e.g. tree_sitter_python
...
They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
2017-01-31 10:29:31 -08:00
Max Brunsfeld
d853b6504d
Add version number to TSLanguage structs
2017-01-31 10:21:47 -08:00
Max Brunsfeld
672d491775
Fix errors in management of external scanner's most recent state
2017-01-30 22:04:46 -08:00
Max Brunsfeld
dc6598e07e
Include external token states in stack debug graphs
2017-01-30 21:58:27 -08:00
Max Brunsfeld
896254eea5
Fix error in changed ranges calculation
...
There was an error in the way that we calculate the reference
scope sequences that are used as the basis for assertions about
changed ranges in randomized tests. The error caused some
characters' scopes to not be checked. This corrects the reference
implementation and fixes a previously uncaught bug in the
implementation of `tree_path_get_changed_ranges`.
Previously, when iterating over the old and new trees, we would
only perform comparisons of visible nodes. This resulted in a failure
to do any comparison for portions of the text in which there were
trailing invisible child nodes (e.g. trailing `_line_break` nodes
inside `statement` nodes in the JavaScript grammar).
Now, we additionally perform comparisons at invisible leaf nodes,
based on their lowest visible ancestor.
2017-01-27 23:47:34 -08:00
Max Brunsfeld
36608180d2
Store external token states in the parse stack
2017-01-08 22:06:05 -08:00
Max Brunsfeld
3a4daace26
Move reusable node functions to their own file
2017-01-05 10:07:27 -08:00
Max Brunsfeld
12cd2132ff
Add test for retrieving last external token state in a Tree
2017-01-04 21:23:04 -08:00
Max Brunsfeld
d57043b665
Add ability to store external token state per stack version
2017-01-04 21:22:23 -08:00
Max Brunsfeld
2fa7b453c8
Restore external scanner's state only after repositioning lexer
...
Also, properly identify the leaf node with the external token state
2016-12-21 13:59:56 -08:00
Max Brunsfeld
3706678b89
Pass const TSExternalTokenState to external scanner deserialize hook
2016-12-21 13:58:18 -08:00
Max Brunsfeld
4136dad5de
Avoid referencing invalid union member in tree_path_descend
2016-12-21 13:21:21 -08:00
Max Brunsfeld
1595a02692
Avoid referencing invalid union member in tree_set_children
2016-12-21 12:23:24 -08:00
Max Brunsfeld
34a65f588d
Tweak naming and organization of external-scanner-related language fields
2016-12-21 11:24:41 -08:00
Max Brunsfeld
42c41c158c
Refactor logic for handling shared internal/external tokens
2016-12-21 10:49:55 -08:00
Max Brunsfeld
e6c82ead2c
Start work toward maintaining external scanner's state during incremental parses
2016-12-20 17:06:20 -08:00
Max Brunsfeld
2b3da512a4
Add serialize, deserialize and reset callbacks to external scanners
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-12-20 13:12:01 -08:00
Max Brunsfeld
a1770ce844
Allow external tokens to be used as extras
2016-12-12 22:06:01 -08:00
Max Brunsfeld
0e595346be
Make lexer log output easier to read
2016-12-09 13:33:37 -08:00
Max Brunsfeld
10b51a05a1
Allow external scanners to refer to (and return) internally-defined tokens
...
Tokens that are defined in the grammar's rules may now also be included in the
externals list, so that external scanners can check whether they are valid
lookaheads and, if so, return them to the parser if needed.
2016-12-09 13:32:58 -08:00
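The idea in commit 10b51a05a1 can be sketched roughly as follows, assuming the scanner receives one validity flag per entry in the externals list. Everything here is hypothetical (the `ExternalToken` enum, `ScanResult`, and the `scan` signature); it is a self-contained illustration and not the actual external scanner interface. `SEMICOLON` stands in for a token that is also defined in the grammar's rules.

```c
#include <stdbool.h>

/* Hypothetical externals list. SEMICOLON is assumed to also be defined
 * by an ordinary grammar rule; HEREDOC_BODY exists only externally. */
enum ExternalToken { HEREDOC_BODY, SEMICOLON, EXTERNAL_TOKEN_COUNT };

typedef struct {
  bool found;               /* whether the scanner recognized a token */
  enum ExternalToken token; /* meaningful only when found is true */
} ScanResult;

/* `valid[i]` says whether token i is an acceptable lookahead in the
 * current parse state. The scanner returns the internally-defined
 * SEMICOLON only when the parser could actually accept it. */
static ScanResult scan(const char *input, const bool *valid) {
  ScanResult result = {false, HEREDOC_BODY};
  if (valid[SEMICOLON] && input[0] == ';') {
    result.found = true;
    result.token = SEMICOLON;
  } else if (valid[HEREDOC_BODY] && input[0] == '<') {
    result.found = true;
    result.token = HEREDOC_BODY;
  }
  return result;
}
```

Checking the validity flag before matching means the scanner declines to produce a token that the current parse state could not accept.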
Max Brunsfeld
7f6ec0131d
Remove duplication between parser_destroy and parser_set_language
2016-12-06 10:12:49 -08:00
Max Brunsfeld
83514293b5
Allow external tokens to be either visible or hidden
2016-12-05 17:26:11 -08:00
Max Brunsfeld
1251ff2e30
Consider externals to be named, not anonymous
2016-12-05 17:09:22 -08:00
Max Brunsfeld
c4fe8ded95
Remove state argument from Lexer advance method
2016-12-05 16:36:34 -08:00
Max Brunsfeld
c16b6b2059
Run external scanners during error recovery
2016-12-05 11:50:24 -08:00
Max Brunsfeld
49d25bd0f8
Remove EXTERNAL_TOKEN grammar rule type
2016-12-04 15:02:32 -08:00
Max Brunsfeld
cf0d8abea1
Destroy external scanner when destroying Parser
2016-12-04 14:18:30 -08:00
Max Brunsfeld
d72b49316b
Handle external tokens in apply_transitive_closure
2016-12-04 10:40:32 -08:00
Max Brunsfeld
0f8e130687
Call external scanner functions when lexing
2016-12-02 22:03:48 -08:00
Max Brunsfeld
c966af0412
Start work on external tokens
2016-12-02 16:24:19 -08:00
Max Brunsfeld
be9e79db1b
Avoid incorrect application of precedence
2016-12-01 10:24:06 -08:00
Max Brunsfeld
996ca91e70
Disallow syntax rules that match the empty string (for now)
2016-11-30 23:19:54 -08:00
Max Brunsfeld
101e304a8a
Avoid unnecessary lookahead set mutations in ParseItemSetBuilder
2016-11-20 21:41:36 -08:00
Max Brunsfeld
06215607d1
Precompute transitive closure contributions by grammar symbol
2016-11-20 11:49:55 -08:00
Max Brunsfeld
5332fd3418
Fix build warnings
2016-11-19 20:47:43 -08:00
Max Brunsfeld
6cf4ccb840
Represent rule metadata as a struct, not a map
2016-11-19 13:59:34 -08:00
Max Brunsfeld
cab1bd3ac5
Make conflict messages explicit about precedence combinations
2016-11-18 17:05:16 -08:00
Max Brunsfeld
5924285e69
🎨
2016-11-18 16:14:05 -08:00
Max Brunsfeld
32387400c6
Rework LR conflict resolution
...
* Unify precedence/associativity-based resolution with the
  search for a whitelisted conflict
* Improve conflict error messages
2016-11-18 13:50:55 -08:00
Max Brunsfeld
6935f1d26f
Use hash_combine everywhere
2016-11-16 11:46:22 -08:00
Max Brunsfeld
6cfd009503
Compute parse state group signature based on the item set
2016-11-16 10:21:30 -08:00
Max Brunsfeld
42d37656ea
Optimize remove_duplicate_parse_states method
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-11-15 17:51:52 -08:00
Max Brunsfeld
e7217f1bac
Clean up some methods in parser.c
2016-11-14 17:25:55 -08:00
Max Brunsfeld
535879a2bd
Represent byte, char and tree counts as 32-bit numbers
...
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4 GB, so representing rows, columns, byte lengths and child indices as
unsigned 32-bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
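A rough illustration of the trade-off described in commit 535879a2bd, assuming a position record with four counters. The struct and function names (`WideLength`, `NarrowLength`, `length_bytes_saved`, `fits_in_u32`) are hypothetical and do not come from the tree-sitter source; the sketch only compares a pointer-sized layout with an unsigned 32-bit one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record using pointer-sized counters. */
typedef struct {
  size_t bytes;
  size_t chars;
  size_t row;
  size_t column;
} WideLength;

/* The same fields stored as unsigned 32-bit integers. */
typedef struct {
  uint32_t bytes;
  uint32_t chars;
  uint32_t row;
  uint32_t column;
} NarrowLength;

/* Bytes saved per record by the narrow layout. On targets where
 * size_t is already 32 bits wide this is zero. */
static size_t length_bytes_saved(void) {
  return sizeof(WideLength) - sizeof(NarrowLength);
}

/* The cost of the narrow layout: counts above UINT32_MAX (just under
 * 4 GiB when counting bytes) can no longer be represented. */
static int fits_in_u32(unsigned long long n) {
  return n <= UINT32_MAX;
}
```

On a typical 64-bit target the narrow record is half the size of the wide one, which matters when a record like this is embedded in every tree and stack node.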
Max Brunsfeld
8edb8df530
Remove extraneous Language methods
2016-11-14 10:35:33 -08:00
Max Brunsfeld
1118a9142a
Introduce Symbol::Index type alias
2016-11-14 10:25:26 -08:00