tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	99d048e016	Simplify error recovery; eliminate recovery states. The previous approach to error recovery relied on special error-recovery states in the parse table. For each token T, there was an error recovery state in which the parser looked for any token that could follow T. Unfortunately, sometimes the set of tokens that could follow T contained conflicts. For example, in JS, the token '}' can be followed by the open-ended 'template_chars' token, but also by ordinary tokens like 'identifier'. So with the old algorithm, when recovering from an unexpected '}' token, the lexer had no way to distinguish identifiers from template_chars. This commit drops the error recovery states. Instead, when we encounter an unexpected token T, we recover from the error by finding a previous state S in the stack in which T would be valid, popping all of the nodes after S, and wrapping them in an error. This way, the lexer is always invoked in a normal parse state, in which it is looking for a non-conflicting set of tokens. Eliminating the error recovery states also shrinks the lex state machine significantly. Signed-off-by: Rick Winfrey <rewinfrey@github.com>	2017-09-11 15:22:52 -07:00
Max Brunsfeld	4c9c05806a	Merge compatible starting token states before constructing lex table	2017-09-05 13:21:53 -07:00
Max Brunsfeld	9d668c5004	Move incompatible token map into LexTableBuilder	2017-08-31 15:46:37 -07:00
Max Brunsfeld	f8649824fa	Remove unused function	2017-08-31 15:30:44 -07:00
Max Brunsfeld	c285fbef38	Clear LexTableBuilder's state after detecting conflicts	2017-08-25 17:11:39 -07:00
Max Brunsfeld	573b5f3671	Pass LexTableBuilder to ParseTableBuilder	2017-08-25 15:57:50 -07:00
Max Brunsfeld	eace426129	Suppress unknown pragma warnings in MSVC	2017-08-09 10:14:05 -07:00
Max Brunsfeld	964dd16812	Avoid unicode escape sequences when generating conflict messages	2017-08-09 09:32:58 -07:00
Max Brunsfeld	5f40adb70c	Recur to sub-rules in a deterministic order in expand_repeats	2017-08-08 17:20:04 -07:00
Max Brunsfeld	e6b43700b9	Get generated parsers compiling and loading properly on windows	2017-08-08 16:47:51 -07:00
Max Brunsfeld	9d616b3bf8	Replace size_t -> LexStateId in LexTableBuilder::remove_duplicate_states	2017-08-08 12:55:35 -07:00
Max Brunsfeld	947c161c2f	Use a constructor rather than aggregate initialization for Production	2017-08-08 10:41:54 -07:00
Max Brunsfeld	e932d09908	Avoid aggregate initialization syntax in places where C++11 doesn't allow it	2017-08-07 13:07:54 -07:00
Max Brunsfeld	bf31c19d03	Avoid initializing production vectors via initializer lists	2017-08-07 12:45:37 -07:00
Max Brunsfeld	94dc703bfc	Require that grammars' start rules be visible	2017-08-04 17:07:37 -07:00
Max Brunsfeld	255f7af24b	Name ParseTableBuilder fields more consistently	2017-08-04 09:47:24 -07:00
Max Brunsfeld	84e4114f79	Allow conflicts involving repeat rules to be whitelisted, via their parent rule	2017-08-03 15:18:29 -07:00
Max Brunsfeld	119c67dd78	Fix conflict reporting for shift/reduce conflicts w/ multiple reductions. We were failing to rule out shift actions with lower precedence. Signed-off-by: Philip Turnbull <philipturnbull@github.com>	2017-08-02 15:13:30 -07:00
Max Brunsfeld	cb5fe80348	Rename RENAME rule to ALIAS, allow it to create anonymous nodes	2017-07-31 16:41:11 -07:00
Max Brunsfeld	b5f421cafb	Fix name collision that gcc didn't tolerate	2017-07-21 16:28:39 -07:00
Max Brunsfeld	cbdfd89675	Mark reductions as fragile based on their final properties. We previously maintained a set of individual productions that were involved in conflicts, but that was subtly incorrect because we don't compare productions themselves when comparing parse items; we only compare the parse items' properties that could affect the final reduce actions.	2017-07-21 09:54:24 -07:00
Max Brunsfeld	7d9d8bce79	Handle inlined rules that contain other inlined rules	2017-07-20 15:29:06 -07:00
Max Brunsfeld	4649c3a37f	Avoid creating redundant rename sequences	2017-07-18 15:29:06 -07:00
Max Brunsfeld	afb499bf2e	Handle rename symbols in ts_language APIs	2017-07-18 12:01:52 -07:00
Max Brunsfeld	9a04231ab1	Remove length restriction in external scanner serialization API	2017-07-17 17:12:36 -07:00
Max Brunsfeld	66dc12587a	Call the external scanner whenever an external token is valid. For some reason, there was previously some extra logic that prevented the external scanner from being invoked if the only valid external token also had an internal definition. It's surprising to not call the external scanner if an external token is valid.	2017-07-17 10:28:59 -07:00
Max Brunsfeld	b3a72954ff	Introduce RENAME rule type	2017-07-13 17:17:22 -07:00
Max Brunsfeld	0b94e9d814	Don't include preceding production steps in ParseItem hash	2017-07-13 13:42:28 -07:00
Max Brunsfeld	561821d011	Remove precedence and associativity methods from ParseAction	2017-07-13 13:41:56 -07:00
Max Brunsfeld	d646889922	Simplify flatten_rule function	2017-07-13 09:59:23 -07:00
Max Brunsfeld	7293e6f0cc	Fix compile warnings	2017-07-12 22:08:36 -07:00
Max Brunsfeld	62c577af33	Remove unnecessary using statements	2017-07-12 21:41:37 -07:00
Max Brunsfeld	a3006bc2b5	Represent LookaheadSet using vectors of bool	2017-07-12 16:02:01 -07:00
Max Brunsfeld	65bf1389e1	Add a way to automatically inline rules	2017-07-11 23:13:44 -07:00
Max Brunsfeld	26a25278cd	When comparing parse items, ignore consumed part of their productions. This speeds up parser generation by increasing the likelihood that we'll recognize parse item sets as equivalent in advance, rather than having to merge their states after the fact.	2017-07-11 17:30:32 -07:00
Max Brunsfeld	a199b217f3	Optimize ParseTableBuilder for non-terminals w/ many productions	2017-07-11 12:54:29 -07:00
Max Brunsfeld	68c3ba1b8b	🎨 merge_parse_state	2017-07-10 16:46:11 -07:00
Max Brunsfeld	5bd5b4bb05	Replace <cctype> -> <cwctype>	2017-07-10 14:35:14 -07:00
Max Brunsfeld	59236d2ed1	Avoid redundant character comparisons in generated lex function	2017-07-10 14:09:31 -07:00
Max Brunsfeld	2755b07222	Don't store unfinished item signature on ParseStates	2017-07-10 10:47:38 -07:00
Max Brunsfeld	1586d70cbe	Compute conflicting tokens more precisely. While generating the parse table, keep track of which tokens can follow one another. Then use this information to evaluate token conflicts more precisely. This will result in a smaller parse table than the previous, overly-conservative approach.	2017-07-07 17:54:24 -07:00
Max Brunsfeld	a98abde529	Provide all preceding symbols as context when reporting conflicts	2017-07-07 14:52:56 -07:00
Max Brunsfeld	c91ceaaa8d	🎨 build_parse_table	2017-07-07 14:52:45 -07:00
Max Brunsfeld	0de93b3bf2	Allow negative dynamic precedences	2017-07-06 22:21:59 -07:00
Max Brunsfeld	d8e9d04fe7	Add PREC_DYNAMIC rule for resolving runtime ambiguities	2017-07-06 15:24:45 -07:00
Max Brunsfeld	20982fdcb9	Mark tokens as non-reusable in states where shorter tokens take precedence. This fixes some randomized test failures in the C grammar, relating to object-like macros. The object-like macro rule relies on a whitespace token in order to distinguish object-like macros whose values begin with a '(' from function-like macros. The presence of that whitespace token means that other nodes should not be reusable in that state.	2017-06-22 16:04:42 -07:00
Max Brunsfeld	8517313a45	🎨	2017-06-22 15:33:07 -07:00
Max Brunsfeld	8157b81b68	Improve logic for short-circuiting trivial lexing conflict detection	2017-06-22 15:33:01 -07:00
Max Brunsfeld	2c043803f1	Be more conservative about avoiding lexing conflicts when merging states. This fixes a bug in the C++ grammar where the `>>` token was merged into a state where it was previously not valid, but the `>` token was valid. This caused nested templates like `std::vector<std::pair<int, int>>` to not parse correctly.	2017-06-22 15:32:13 -07:00
Phil Turnbull	fdd8792ebc	Correctly set is_first. From scan-build: Value stored to 'is_first' is never read	2017-06-14 11:12:06 -04:00
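
The recovery strategy described in commit 99d048e016 (find an earlier stack state S in which the unexpected token T is valid, pop everything after S, wrap the popped nodes in an error) can be illustrated with a small sketch. This is a toy model in Python, not tree-sitter's actual code: the `VALID_TOKENS` table, the `(state, node)` stack layout, the `recover` function, and the choice to keep the parser in state S after pushing the error node are all assumptions made for illustration.

```python
# Hypothetical toy table: parse state -> tokens that are valid in that state.
VALID_TOKENS = {
    0: {"identifier", "{"},
    1: {"identifier", "}"},
    2: {"template_chars", "`"},
}


def recover(stack, token, valid_tokens):
    """Recover from an unexpected `token`.

    `stack` is a list of (state, node) pairs; each state is the one the
    parser was in after pushing that node. Walk down from just below the
    top of the stack to find a previous state S in which `token` is valid,
    pop every entry above S, and push the popped nodes back as a single
    ("ERROR", [...]) node. Returns True if such a state was found.
    """
    for depth in range(len(stack) - 2, -1, -1):
        state = stack[depth][0]
        if token in valid_tokens[state]:
            popped = [node for _, node in stack[depth + 1:]]
            del stack[depth + 1:]
            # Assumption: the error node leaves the parser in state S.
            stack.append((state, ("ERROR", popped)))
            return True
    return False


# An unexpected '}' arrives while the parser is in state 2.
stack = [(0, None), (1, "{"), (2, "`"), (2, "template_chars")]
recovered = recover(stack, "}", VALID_TOKENS)
```

Because the parser ends up back in an ordinary state (state 1 in this toy table), the next lexing step only has to consider that state's token set, which is the property the commit message relies on.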

573 commits