This simplifies the logic for determining whether a token is reusable
and makes it more conservative. It should fix some incremental parsing
bugs that are being caught by the randomized tests on CI.
They were a vestige of when Tree-sitter did sentential form-based
incremental parsing (as opposed to simply state matching). This was
elegant but not compatible with GLR as far as I could tell.
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.
This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.
This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.
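A minimal sketch of this stack-unwinding recovery, with a toy stack and a
hard-coded validity table standing in for Tree-sitter's actual structures
and generated parse tables:

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for the parser's real data structures. */
typedef struct {
  int state;        /* parse state that was active when this entry was pushed */
  const char *node; /* the subtree, stood in by a label here */
} StackEntry;

/* Stand-in parse-table query: is `token` a valid lookahead in `state`?
 * A real parser would consult the generated parse table. */
static bool token_is_valid_in_state(int state, int token) {
  return state == 1 && token == 42; /* toy table: only state 1 accepts token 42 */
}

/* On an unexpected token T, find the most recent state S on the stack in
 * which T is valid, pop every entry above S, and replace the popped
 * entries with a single error node sitting on top of S. */
static bool recover(StackEntry *stack, int *depth, int unexpected_token) {
  for (int i = *depth - 1; i >= 0; i--) {
    if (token_is_valid_in_state(stack[i].state, unexpected_token)) {
      stack[i + 1].state = stack[i].state; /* resume in the normal state S */
      stack[i + 1].node = "ERROR";         /* wraps the popped subtrees */
      *depth = i + 2;
      return true;
    }
  }
  return false; /* no state on the stack accepts T */
}

int main(void) {
  StackEntry stack[8] = {
    {0, "program"}, {1, "statement"}, {2, "expression"}, {3, "call"},
  };
  int depth = 4;
  if (recover(stack, &depth, 42)) {
    for (int i = 0; i < depth; i++) printf("%d:%s ", stack[i].state, stack[i].node);
    printf("\n"); /* prints "0:program 1:statement 1:ERROR" */
  }
}
```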
Signed-off-by: Rick Winfrey <rewinfrey@github.com>
For some reason, there was previously some extra logic that prevented
the external scanner from being invoked if the only valid external
token also had an internal definition.
It's surprising not to call the external scanner when an external token
is valid.
The current, fairly conservative approach is to avoid merging parse states
when doing so would cause a pair of tokens to co-exist in a parse state for
the first time, where the two tokens can start with the same character and at
least one of them can contain a character that is part of the grammar's
separators.
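Expressed as a predicate, that condition might look roughly like the
following sketch; the helper queries and their hard-coded stand-in
implementations are hypothetical, not the actual generator code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for queries a generator could answer from the
 * grammar; hard-coded here purely for illustration. */
static bool tokens_already_coexist(int a, int b) { (void)a; (void)b; return false; }
static bool first_chars_intersect(int a, int b) { return (a % 2) == (b % 2); }
static bool can_contain_separator_char(int t) { return t == 7; }

/* Merging two parse states is rejected if it would make some pair of
 * tokens co-exist for the first time, where the two tokens can start with
 * the same character and at least one of them can contain a character
 * that belongs to the grammar's separators. */
static bool merge_is_unsafe(const int *tokens_a, size_t len_a,
                            const int *tokens_b, size_t len_b) {
  for (size_t i = 0; i < len_a; i++) {
    for (size_t j = 0; j < len_b; j++) {
      int a = tokens_a[i], b = tokens_b[j];
      if (a == b || tokens_already_coexist(a, b)) continue;
      if (first_chars_intersect(a, b) &&
          (can_contain_separator_char(a) || can_contain_separator_char(b))) {
        return true; /* this pairing would be new and potentially ambiguous */
      }
    }
  }
  return false;
}

int main(void) {
  int state_a_tokens[] = {1, 3}; /* tokens valid in the first state */
  int state_b_tokens[] = {7};    /* tokens valid in the second state */
  printf("unsafe: %d\n",
         merge_is_unsafe(state_a_tokens, 2, state_b_tokens, 1));
}
```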
They used to be called e.g. `ts_language_python`. Now that there
are APIs that deal with the `TSLanguage` objects themselves, such
as `ts_language_symbol_count`, the old names were a little confusing.
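A sketch of how the two kinds of functions now read side by side, using
today's `tree_sitter/api.h` header; the `tree_sitter_python` spelling is the
assumed new name:

```c
#include <stdio.h>
#include <tree_sitter/api.h>

/* Per-language constructors now use the `tree_sitter_` prefix (assumed
 * spelling), leaving `ts_language_` for the generic APIs that operate
 * on TSLanguage objects. */
const TSLanguage *tree_sitter_python(void);

int main(void) {
  TSParser *parser = ts_parser_new();
  const TSLanguage *language = tree_sitter_python();
  ts_parser_set_language(parser, language);
  printf("symbols: %u\n", ts_language_symbol_count(language));
  ts_parser_delete(parser);
}
```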
Tokens that are defined in the grammar's rules may now also be included in
the externals list, so that external scanners can check whether those tokens
are valid lookaheads and, if so, return them to the parser.
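A sketch of how a scanner can use this, for a hypothetical language named
`example` whose `externals` list contains a `RAW_STRING` token that is also
defined in the grammar's rules (the token and its recognition logic are made
up for illustration):

```c
#include <tree_sitter/parser.h>
#include <stdbool.h>

/* Token indices correspond to the order of the grammar's `externals` array.
 * RAW_STRING is hypothetical and assumed to also have an internal definition. */
enum TokenType { RAW_STRING };

void *tree_sitter_example_external_scanner_create(void) { return NULL; }
void tree_sitter_example_external_scanner_destroy(void *payload) { (void)payload; }
unsigned tree_sitter_example_external_scanner_serialize(void *payload, char *buffer) {
  (void)payload; (void)buffer;
  return 0; /* this scanner keeps no state */
}
void tree_sitter_example_external_scanner_deserialize(void *payload,
                                                      const char *buffer,
                                                      unsigned length) {
  (void)payload; (void)buffer; (void)length;
}

bool tree_sitter_example_external_scanner_scan(void *payload, TSLexer *lexer,
                                               const bool *valid_symbols) {
  (void)payload;
  /* Because RAW_STRING appears in `externals`, the scanner can check
   * whether it is a valid lookahead at the current position... */
  if (valid_symbols[RAW_STRING] && lexer->lookahead == 'r') {
    /* ...and, if so, consume characters and hand the token back. */
    while (lexer->lookahead != 0 && lexer->lookahead != '\n') {
      lexer->advance(lexer, false);
    }
    lexer->result_symbol = RAW_STRING;
    return true;
  }
  return false; /* fall back to the internal lexer's definition */
}
```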