Commit graph

305 commits

Author SHA1 Message Date
Max Brunsfeld
1fece241aa Add ts_parser_set_enabled API 2018-05-21 17:28:12 -07:00
Max Brunsfeld
3c01382b95 Avoid warnings about repeated typedefs 2018-05-17 17:59:50 -07:00
Max Brunsfeld
5ec3769cb4 Make ts_tree_cursor_current_node take the cursor as const 2018-05-17 14:24:32 -07:00
Max Brunsfeld
074c051094 Change the TSInputEdit struct to work with old/new start and end positions 2018-05-17 11:14:51 -07:00
Max Brunsfeld
95be6e3bee Make it clear which field of TSNode can be used as a unique id 2018-05-16 16:20:33 -07:00
Max Brunsfeld
e3670be42f Avoid one heap allocation when instantiating a TSTreeCursor 2018-05-16 16:05:08 -07:00
Max Brunsfeld
6fc8d9871c Hide the details of TSNode's fields in the public API 2018-05-16 15:44:04 -07:00
Max Brunsfeld
ebddb1a0b5 Add ts_tree_cursor_goto_first_child_for_byte method
Atom needs this for efficiently seeking to the leaf node at a given position,
visiting all of its ancestors along the way.
2018-05-16 13:51:21 -07:00
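The seek described in ebddb1a0b5 can be illustrated with a toy tree. This is only a sketch under stated assumptions, not the library's implementation: the `Node` struct, the half-open byte ranges, and the helpers `goto_first_child_for_byte` and `seek_leaf` are hypothetical stand-ins for the real cursor. Each step moves to the first child that extends beyond the target byte, so every ancestor of the leaf is visited exactly once on the way down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical toy node: covers the half-open byte range [start_byte, end_byte).
typedef struct Node {
  uint32_t start_byte, end_byte;
  const struct Node *children;
  size_t child_count;
} Node;

// Move *node to its first child that extends beyond `byte`.
// Returns that child's index, or -1 if there is no such child.
static int goto_first_child_for_byte(const Node **node, uint32_t byte) {
  for (size_t i = 0; i < (*node)->child_count; i++) {
    const Node *child = &(*node)->children[i];
    if (child->end_byte > byte) {
      *node = child;
      return (int)i;
    }
  }
  return -1;
}

// Descend from `root` to the leaf at `byte`, one ancestor per step.
// Stores the leaf in *leaf and returns the number of steps taken.
static int seek_leaf(const Node *root, uint32_t byte, const Node **leaf) {
  int depth = 0;
  const Node *node = root;
  while (goto_first_child_for_byte(&node, byte) >= 0) depth++;
  *leaf = node;
  return depth;
}
```

The work per level is proportional to the number of children scanned, and no node outside the path from root to leaf is entered.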
Max Brunsfeld
fe53506175 Declare subtrees as const wherever possible
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-11 15:06:13 -07:00
Max Brunsfeld
bf1bb1604f Rename TSExternalTokenState -> ExternalScannerState 2018-05-11 12:57:41 -07:00
Max Brunsfeld
199a94cc26 Allow the parser to print dot graphs to any file 2018-05-11 12:48:51 -07:00
Max Brunsfeld
e75ecd1bb1 Rework API completely 2018-05-11 10:46:13 -07:00
Max Brunsfeld
666dfb76d2 Remove document parameter from ts_node_type, ts_node_string
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 16:47:47 -07:00
Max Brunsfeld
92255bbfdd Remove document parameter from ts_node_type, ts_node_string
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 15:28:28 -07:00
Max Brunsfeld
973e4a44f0 Start work on removing parent pointers
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2018-05-09 12:22:19 -07:00
Max Brunsfeld
e917756ad1 Remove depends_on_lookahead field from parse table entries
This simplifies the logic for determining whether a token is reusable
and makes it more conservative. It should fix some incremental parsing
bugs that are being caught by the randomized tests on CI.
2018-03-28 10:58:33 -07:00
Max Brunsfeld
0810971f3e 🔥 symbol iterator API
This idea was never fully baked.
2018-03-08 14:16:37 -08:00
Max Brunsfeld
c0cc35ff07 Create separate lexer function for keywords 2018-03-07 12:00:26 -08:00
Max Brunsfeld
16cdd2ffbe Bump language ABI version after removing fragile bit from actions 2018-03-05 17:13:11 -08:00
Max Brunsfeld
52087de4f0 Remove the concept of fragile reductions
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
2018-03-02 14:51:54 -08:00
Max Brunsfeld
facafcd6e4 Pass row/column position to input seek method 2018-02-14 07:31:49 -08:00
Max Brunsfeld
8c29841adf Represent repetitions with associative structure 2018-02-12 11:41:56 -08:00
Max Brunsfeld
315dff3285 Add an API for getting a node's child index 2018-01-09 14:01:36 -08:00
Max Brunsfeld
f653f2b3bb Add ts_node_first_{child,named_child}_for_byte methods 2018-01-09 13:44:59 -08:00
Max Brunsfeld
d3c85f288d Start work on repairing errors by inserting missing tokens 2017-12-29 15:11:00 -08:00
Max Brunsfeld
0e69da37a5 Return a character count from the lexer's get_column method 2017-12-20 16:26:38 -08:00
Max Brunsfeld
fcff16cb86 Add get_column method to lexer 2017-12-19 17:54:15 -08:00
Max Brunsfeld
b0fdc33f73 Remove 'extra' and 'structural' booleans from symbol metadata 2017-09-14 12:07:46 -07:00
Max Brunsfeld
037933ffc5 Bump LANGUAGE_VERSION constant due to incompatible parse table change 2017-09-14 11:09:26 -07:00
Max Brunsfeld
99d048e016 Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
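The recovery strategy described in 99d048e016 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the parser's actual code: the `VALID` table, the `Stack` struct, and the `recover` function are all hypothetical, and the sketch only counts the nodes that would be wrapped in an error rather than building the error node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_DEPTH = 16, STATE_COUNT = 3, TOKEN_COUNT = 3 };

// Hypothetical toy table: VALID[state][token] is true when `token` is
// acceptable in `state`. A real parse table is far larger.
static const bool VALID[STATE_COUNT][TOKEN_COUNT] = {
  {true, false, false},
  {false, true, false},
  {false, false, true},
};

// Hypothetical parse stack: states[i] is the state reached after i nodes
// have been pushed, so the top of the stack is states[size - 1].
typedef struct {
  int states[MAX_DEPTH];
  size_t size;
} Stack;

// Given an unexpected `token`, search downward from just below the top of
// the stack for a previous state S in which the token is valid. On success,
// truncate the stack to S and return how many nodes were popped (these are
// the nodes that would be wrapped in an error). Return -1 and leave the
// stack untouched if no previous state accepts the token.
static int recover(Stack *stack, int token) {
  if (stack->size == 0) return -1;
  for (size_t i = stack->size - 1; i-- > 0;) {
    if (VALID[stack->states[i]][token]) {
      int popped = (int)(stack->size - 1 - i);
      stack->size = i + 1;
      return popped;
    }
  }
  return -1;
}
```

Because the search only ever lands on ordinary parse states, lexing resumes in a state with a non-conflicting token set, which is the property the commit message emphasizes.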
Max Brunsfeld
e6b43700b9 Get generated parsers compiling and loading properly on windows 2017-08-08 16:47:51 -07:00
Max Brunsfeld
94dc703bfc Require that grammars' start rules be visible 2017-08-04 17:07:37 -07:00
Max Brunsfeld
cb5fe80348 Rename RENAME rule to ALIAS, allow it to create anonymous nodes 2017-07-31 16:41:11 -07:00
Max Brunsfeld
1df41a9107 Avoid anonymous struct to silence gcc's override-init warning (again) 2017-07-21 10:17:54 -07:00
Max Brunsfeld
afb499bf2e Handle rename symbols in ts_language APIs 2017-07-18 12:01:52 -07:00
Max Brunsfeld
9a04231ab1 Remove length restriction in external scanner serialization API 2017-07-17 17:12:36 -07:00
Max Brunsfeld
1a195d44bb Whoops, dynamic precedence needs a sign 2017-07-14 11:06:16 -07:00
Max Brunsfeld
b3a72954ff Introduce RENAME rule type 2017-07-13 17:17:22 -07:00
Max Brunsfeld
107feb7960 Bump the language version number after adding dynamic precedences 2017-07-06 15:58:29 -07:00
Max Brunsfeld
d8e9d04fe7 Add PREC_DYNAMIC rule for resolving runtime ambiguities 2017-07-06 15:24:45 -07:00
Max Brunsfeld
17bc3dfaf7 Add a benchmark command
This command measures the speed of parsing each grammar's examples.
It also uses each grammar to parse all of the *other* grammars' examples
in order to measure error recovery performance with fairly large files.
2017-07-05 14:14:38 -07:00
Max Brunsfeld
c66fddd3aa Add TSInput option to measure columns in bytes not characters 2017-06-15 16:35:34 -07:00
Max Brunsfeld
a98d449d88 Add an option to immediately halt on syntax error 2017-05-01 13:50:49 -07:00
Rob Rix
3a888b1623 Define a function providing the type of a given symbol. 2017-04-12 09:47:51 -04:00
Rob Rix
4b1f69142d Define a symbol type enum. 2017-04-12 09:46:01 -04:00
Max Brunsfeld
db4b9ebc7c Implement Rule as a union rather than an abstract base class 2017-03-17 13:29:31 -07:00
Max Brunsfeld
d222dbb9fd Allow lexer to accept tokens that ended at previous positions
* Track lookahead in each tree
* Add 'mark_end' API that external scanners can use
2017-03-13 17:06:52 -07:00
Rob Rix
c230658bae Add public API to set the input string with explicit length. 2017-02-10 09:10:31 -05:00
Max Brunsfeld
4131e1c16e Return an error when external token name matches non-terminal rule 2017-01-31 11:36:51 -08:00
Max Brunsfeld
d853b6504d Add version number to TSLanguage structs 2017-01-31 10:21:47 -08:00