Rob Rix
3a888b1623
Define a function providing the type of a given symbol.
2017-04-12 09:47:51 -04:00
Rob Rix
4b1f69142d
Define a symbol type enum.
2017-04-12 09:46:01 -04:00
Max Brunsfeld
db4b9ebc7c
Implement Rule as a union rather than an abstract base class
2017-03-17 13:29:31 -07:00
Max Brunsfeld
d222dbb9fd
Allow lexer to accept tokens that ended at previous positions
* Track lookahead in each tree
* Add 'mark_end' API that external scanners can use
2017-03-13 17:06:52 -07:00
Rob Rix
c230658bae
Add public API to set the input string with explicit length.
2017-02-10 09:10:31 -05:00
Max Brunsfeld
4131e1c16e
Return an error when external token name matches non-terminal rule
2017-01-31 11:36:51 -08:00
Max Brunsfeld
d853b6504d
Add version number to TSLanguage structs
2017-01-31 10:21:47 -08:00
Max Brunsfeld
3706678b89
Pass const TSExternalTokenState to external scanner deserialize hook
2016-12-21 13:58:18 -08:00
Max Brunsfeld
34a65f588d
Tweak naming and organization of external-scanner related language fields
2016-12-21 11:24:41 -08:00
Max Brunsfeld
2b3da512a4
Add serialize, deserialize and reset callbacks to external scanners
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-12-20 13:12:01 -08:00
Max Brunsfeld
c4fe8ded95
Remove state argument to Lexer advance method
2016-12-05 16:36:34 -08:00
Max Brunsfeld
0f8e130687
Call external scanner functions when lexing
2016-12-02 22:03:48 -08:00
Max Brunsfeld
c966af0412
Start work on external tokens
2016-12-02 16:24:19 -08:00
Max Brunsfeld
996ca91e70
Disallow syntax rules that match the empty string (for now)
2016-11-30 23:19:54 -08:00
Max Brunsfeld
535879a2bd
Represent byte, char and tree counts as 32-bit numbers
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32-bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
Max Brunsfeld
fad7294ba4
Store shift states for non-terminals directly in the main parse table
2016-11-14 08:36:06 -08:00
Max Brunsfeld
4106ecda43
Remove logic for recovering from OOM
2016-11-04 09:18:38 -07:00
Max Brunsfeld
e53beb66c9
Avoid anonymous nested struct to silence override-init warnings
2016-10-26 11:10:56 -07:00
Max Brunsfeld
eed54d95e1
Merge branch 'master' into changed-ranges
2016-10-16 21:10:25 -07:00
Max Brunsfeld
e149d94ff5
Remove generated parsers' dependency on runtime.h
2016-10-05 14:02:49 -07:00
Max Brunsfeld
00528e50ce
Change edit API to be byte-based
2016-09-13 13:08:52 -07:00
Max Brunsfeld
cc62fe0375
Represent Lengths in terms of Points
2016-09-09 21:11:02 -07:00
Max Brunsfeld
131bbee160
Rename parse_and_diff -> parse_and_get_changed_ranges
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-09-08 17:51:34 -07:00
Max Brunsfeld
fce8d57152
Start work on document_parse_and_diff API
2016-09-08 17:51:20 -07:00
Max Brunsfeld
a6a08dde31
Rename ts_node_name -> ts_node_type
2016-09-06 21:43:59 -07:00
Max Brunsfeld
38241d466b
Rename .read_fn, .seek_fn -> .read, .seek
2016-09-06 21:39:10 -07:00
Max Brunsfeld
f6da44fdbb
Add ts_node_descendant_for_byte_range
2016-09-06 21:33:19 -07:00
Max Brunsfeld
70756034f1
Allow descendant queries by both 1D and 2D coordinates
2016-09-06 21:17:26 -07:00
Max Brunsfeld
096ac2d4b6
Rename ts_document_set_debugger -> ts_document_set_logger
2016-09-06 17:40:26 -07:00
Max Brunsfeld
64a6c9db0e
Rename ts_document_make -> ts_document_new
2016-09-06 17:26:18 -07:00
Max Brunsfeld
b76574e01c
Handle ambiguities between extra and non-extra tokens using normal GLR splitting
2016-09-06 10:22:16 -07:00
Max Brunsfeld
4f0c83ba01
Move logic for lexical error handling outside of lexer functions
This way, less logic needs to be exposed in parser.h
2016-09-03 23:40:57 -07:00
Max Brunsfeld
1c52c30111
Fix unexpected EOF errors getting lost
2016-09-03 22:46:14 -07:00
Max Brunsfeld
8c26d99353
Store error recovery actions in the normal parse table
2016-06-27 14:07:47 -07:00
Max Brunsfeld
43ae8235fd
Remove the error action; a lack of actions implies an error.
2016-06-21 22:53:48 -07:00
Max Brunsfeld
6a7a5cfc3f
Remove nesting in parse action struct
2016-06-21 21:36:33 -07:00
Max Brunsfeld
38c144b4a3
Refine logic for deciding when tokens need to be re-lexed
* While generating the lex table, note which tokens can match the
  same string. A token needs to be re-lexed when it has possible
  homonyms in the current state.
* Also note which tokens can match substrings of other tokens.
  A token needs to be re-lexed when there are viable tokens that
  could match longer strings in the current state and the next
  token has been edited.
* Remove the logic for marking tokens as fragile on creation.
* Move the reusability/non-reusability of symbols off of individual
  actions and onto the entire entry for the state & symbol.
2016-06-21 07:28:04 -07:00
Max Brunsfeld
45f7cee0c8
Handle extra tokens properly during error recovery
2016-06-18 20:46:25 -07:00
Max Brunsfeld
94721c7ec0
Rewind and re-tokenize in error mode after detecting an error
2016-06-17 21:26:03 -07:00
Max Brunsfeld
1e353381ff
Don't create error node in lexer unless token is completely invalid
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.
Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
2016-05-26 14:15:10 -07:00
Max Brunsfeld
a3679fbb1f
Distinguish separators from main tokens via a property on transitions
It was incorrect to store this as a property on the lexical states themselves.
2016-05-19 16:27:25 -07:00
Max Brunsfeld
31cc6e6f9c
Remove unused InProgressSymbolEntry typedef
2016-05-16 12:46:29 -07:00
Max Brunsfeld
22c550c9d6
Discard tokens after error detection to find the best repair
* Use GLR stack-splitting to try all numbers of tokens to
  discard until a repair is found.
* Check the validity of repairs by looking at the child trees,
  rather than the statically-computed 'in-progress symbols' list.
2016-05-11 13:49:43 -07:00
Max Brunsfeld
9d247e45b2
Deemphasize extra trees in stack debugging graphs
2016-05-01 15:24:50 -07:00
Max Brunsfeld
9ad1e36238
Rename out_of_context_states -> recovery_states
2016-04-27 14:14:56 -07:00
Max Brunsfeld
f63fcffe95
Fix incorrect cast in ts_language_symbol_is_in_progress
2016-04-18 11:17:07 -07:00
Max Brunsfeld
e0c24e3be6
Remove old error recovery code
2016-03-02 20:58:39 -08:00
Max Brunsfeld
c8d7c16f87
Use out-of-context states when in error parse state
2016-03-02 20:56:05 -08:00
Max Brunsfeld
9b2e775b79
Store out-of-context states in the language struct
2016-03-02 20:56:05 -08:00
Max Brunsfeld
ffcd8b5c49
Generate C code for the in-progress symbols in each parse state
2016-03-02 20:56:05 -08:00