tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	52087de4f0	Remove the concept of fragile reductions They were a vestige of when Tree-sitter did sentential form-based incremental parsing (as opposed to simply state matching). This was elegant but not compatible with GLR as far as I could tell.	2018-03-02 14:51:54 -08:00
Max Brunsfeld	8c29841adf	Represent repetitions with associative structure	2018-02-12 11:41:56 -08:00
Max Brunsfeld	b0fdc33f73	Remove 'extra' and 'structural' booleans from symbol metadata	2017-09-14 12:07:46 -07:00
Max Brunsfeld	99d048e016	Simplify error recovery; eliminate recovery states The previous approach to error recovery relied on special error-recovery states in the parse table. For each token T, there was an error recovery state in which the parser looked for any token that could follow T. Unfortunately, sometimes the set of tokens that could follow T contained conflicts. For example, in JS, the token '}' can be followed by the open-ended 'template_chars' token, but also by ordinary tokens like 'identifier'. So with the old algorithm, when recovering from an unexpected '}' token, the lexer had no way to distinguish identifiers from template_chars. This commit drops the error recovery states. Instead, when we encounter an unexpected token T, we recover from the error by finding a previous state S in the stack in which T would be valid, popping all of the nodes after S, and wrapping them in an error. This way, the lexer is always invoked in a normal parse state, in which it is looking for a non-conflicting set of tokens. Eliminating the error recovery states also shrinks the lex state machine significantly. Signed-off-by: Rick Winfrey <rewinfrey@github.com>	2017-09-11 15:22:52 -07:00
Max Brunsfeld	cb5fe80348	Rename RENAME rule to ALIAS, allow it to create anonymous nodes	2017-07-31 16:41:11 -07:00
Max Brunsfeld	cbdfd89675	Mark reductions as fragile based on their final properties We previously maintained a set of individual productions that were involved in conflicts, but that was subtly incorrect because we don't compare productions themselves when comparing parse items; we only compare the parse items' properties that could affect the final reduce actions.	2017-07-21 09:54:24 -07:00
Max Brunsfeld	4649c3a37f	Avoid creating redundant rename sequences	2017-07-18 15:29:06 -07:00
Max Brunsfeld	b3a72954ff	Introduce RENAME rule type	2017-07-13 17:17:22 -07:00
Max Brunsfeld	561821d011	Remove precedence and associativity methods from ParseAction	2017-07-13 13:41:56 -07:00
Max Brunsfeld	2755b07222	Don't store unfinished item signature on ParseStates	2017-07-10 10:47:38 -07:00
Max Brunsfeld	d8e9d04fe7	Add PREC_DYNAMIC rule for resolving runtime ambiguities	2017-07-06 15:24:45 -07:00
Max Brunsfeld	db4b9ebc7c	Implement Rule as a union rather than an abstract base class	2017-03-17 13:29:31 -07:00
Max Brunsfeld	abf8a4f2c2	🎨	2017-03-01 22:15:26 -08:00
Max Brunsfeld	686dc0997c	Avoid introducing certain lexical conflicts during parse state merging The current, fairly conservative approach is to avoid merging parse states which would cause a pair of tokens to co-exist for the first time in any parse state, where the two tokens can start with the same character and at least one of the tokens can contain a character which is part of the grammar's separators.	2017-02-27 22:54:38 -08:00
Max Brunsfeld	3c8e6f9987	Restructure parse state merging logic * Remove remnants of templatized remove_duplicate_states function * Rename recovery_tokens function to get_compatible_tokens and augment it to also compute pairs of tokens which could potentially be incompatible	2017-02-26 12:23:48 -08:00
Max Brunsfeld	c966af0412	Start work on external tokens	2016-12-02 16:24:19 -08:00
Max Brunsfeld	5332fd3418	Fix build warnings	2016-11-19 20:47:43 -08:00
Max Brunsfeld	32387400c6	Rework LR conflict resolution * Unify precedence/associativity-based resolution with the search for a whitelisted conflict * Improve conflict error messages	2016-11-18 13:50:55 -08:00
Max Brunsfeld	6cfd009503	Compute parse state group signature based on the item set	2016-11-16 10:21:30 -08:00
Max Brunsfeld	42d37656ea	Optimize remove_duplicate_parse_states method Signed-off-by: Nathan Sobo <nathan@github.com>	2016-11-15 17:51:52 -08:00
Max Brunsfeld	1118a9142a	Introduce Symbol::Index type alias	2016-11-14 10:25:26 -08:00
Max Brunsfeld	a89f8c086b	Remove stray #include	2016-11-14 09:31:32 -08:00
Max Brunsfeld	fad7294ba4	Store shift states for non-terminals directly in the main parse table	2016-11-14 08:36:06 -08:00
Max Brunsfeld	255bc2427c	🎨 build_parse_table	2016-11-09 20:47:47 -08:00
Timothy Clem	14bae584d4	WIP: New check for mergeable symbols in merge_state	2016-10-18 13:03:41 -07:00
Max Brunsfeld	b76574e01c	Handle ambiguities between extra and non-extra tokens using normal GLR splitting	2016-09-06 10:22:16 -07:00
Max Brunsfeld	0e2bbbd7ee	Compress parse table by allowing reductions w/ unexpected lookaheads	2016-07-04 12:20:23 -07:00
Max Brunsfeld	8c26d99353	Store error recovery actions in the normal parse table	2016-06-27 14:07:47 -07:00
Max Brunsfeld	38c144b4a3	Refine logic for deciding when tokens need to be re-lexed * While generating the lex table, note which tokens can match the same string. A token needs to be relexed when it has possible homonyms in the current state. * Also note which tokens can match substrings of each other. A token needs to be relexed when there are viable tokens that could match longer strings in the current state and the next token has been edited. * Remove the logic for marking tokens as fragile on creation. * Move the reusability/non-reusability of symbols off of individual actions and onto the entire entry for the state & symbol.	2016-06-21 07:28:04 -07:00
Max Brunsfeld	22c550c9d6	Discard tokens after error detection to find the best repair * Use GLR stack-splitting to try all numbers of tokens to discard until a repair is found. * Check the validity of repairs by looking at the child trees, rather than the statically-computed 'in-progress symbols' list	2016-05-11 13:49:43 -07:00
Max Brunsfeld	31f6b2e24a	Refactor construction of out-of-context states	2016-04-25 21:59:40 -07:00
Max Brunsfeld	ffcd8b5c49	Generate C code for the in-progress symbols in each parse state	2016-03-02 20:56:05 -08:00
Max Brunsfeld	00d953f507	Generate C code for out-of-context states	2016-03-02 20:56:05 -08:00
Max Brunsfeld	6401a065ae	Use different types for advance and accept-token actions Unlike with parse actions, lexical actions of different types never appear in the same places in the table	2016-01-22 22:24:11 -07:00
Max Brunsfeld	386b124866	Ensure that there are no duplicate lex states	2015-12-20 15:46:13 -08:00
Max Brunsfeld	c495076adb	Record in parse table which actions can hide splits Suppose a parse state S has multiple actions for a terminal lookahead symbol A. Then during incremental parsing, while in state S, the parser should not reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B might prematurely discard one of the possible actions that a batch parser would have attempted in state S, upon seeing A as a lookahead.	2015-12-17 13:11:56 -08:00
Max Brunsfeld	75f31a79a3	Treat reduce actions with different production IDs as distinct	2015-12-10 13:00:26 -08:00
Max Brunsfeld	ad619d95f6	Add 'extra' field to symbol metadata This stores whether a symbol is only ever used as a ubiquitous token. This will allow ubiquitous nodes to be reused more effectively: if they are always ubiquitous, then they can be reused immediately, and otherwise, they must be broken down in case they need to be used structurally.	2015-12-02 15:10:24 -08:00
Max Brunsfeld	1983bcfb60	Fix conflation of finished items w/ different precedence	2015-10-18 12:51:32 -07:00
Max Brunsfeld	9959fe35b0	Allow associativity to be specified in rules w/o precedence	2015-10-13 11:25:28 -07:00
Max Brunsfeld	6d748a6714	Store parse actions' precedences as ranges, not sets	2015-10-05 16:05:19 -07:00
Max Brunsfeld	ebc52f109d	Merge branch 'flatten-rules-into-productions' This branch had diverged considerably, so merging it required changing a lot of code. Conflicts: project.gyp spec/compiler/build_tables/action_takes_precedence_spec.cc spec/compiler/build_tables/build_conflict_spec.cc spec/compiler/build_tables/build_parse_table_spec.cc spec/compiler/build_tables/first_symbols_spec.cc spec/compiler/build_tables/item_set_closure_spec.cc spec/compiler/build_tables/item_set_transitions_spec.cc spec/compiler/build_tables/rule_can_be_blank_spec.cc spec/compiler/helpers/containers.h spec/compiler/prepare_grammar/expand_repeats_spec.cc spec/compiler/prepare_grammar/extract_tokens_spec.cc src/compiler/build_tables/action_takes_precedence.h src/compiler/build_tables/build_parse_table.cc src/compiler/build_tables/first_symbols.cc src/compiler/build_tables/first_symbols.h src/compiler/build_tables/item_set_closure.cc src/compiler/build_tables/item_set_transitions.cc src/compiler/build_tables/parse_item.cc src/compiler/build_tables/parse_item.h src/compiler/build_tables/rule_can_be_blank.cc src/compiler/build_tables/rule_can_be_blank.h src/compiler/prepare_grammar/expand_repeats.cc src/compiler/prepare_grammar/extract_tokens.cc src/compiler/prepare_grammar/extract_tokens.h src/compiler/prepare_grammar/prepare_grammar.cc src/compiler/rules/built_in_symbols.cc src/compiler/rules/built_in_symbols.h src/compiler/syntax_grammar.cc src/compiler/syntax_grammar.h	2015-10-02 23:46:39 -07:00
Max Brunsfeld	e6f3239bef	Move stream operator definitions to spec helpers This is one less thing for users to worry about when compiling and linking the library itself	2015-09-10 10:12:11 -07:00
Max Brunsfeld	bd77ab1ac9	Move public rule functions out of rule namespace This way, there's only one public namespace: tree_sitter	2015-09-03 17:49:20 -07:00
Max Brunsfeld	f9b057f3a9	clang-format everything	2015-07-27 18:29:48 -07:00
Max Brunsfeld	aabcb10cfb	Respect expected_conflicts field when building parse table	2015-06-28 16:22:31 -05:00
Max Brunsfeld	80ec303b10	Replace prec rule w/ left_assoc and right_assoc Consider shift/reduce conflicts to be compilation errors unless they are resolved by a specified associativity.	2015-03-16 23:12:34 -07:00
Max Brunsfeld	3458fa6e50	Fix non-deterministic order in conflict description	2015-03-07 11:02:21 -08:00
Max Brunsfeld	2d436cf141	Identify fragile reductions at compile time	2015-02-21 15:11:03 -08:00
Max Brunsfeld	98cc2f2264	Auto-format all source code with clang-format	2014-07-21 13:20:00 -07:00
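
The recovery strategy described in commit 99d048e016 (find a previous stack state S in which the unexpected token T is valid, pop everything after S, and wrap the popped nodes in an error) can be illustrated with a small sketch. This is not Tree-sitter's implementation: `StackEntry`, `recover`, and `valid_tokens` are hypothetical names chosen for illustration, the stack is assumed to be a plain list rather than a GLR graph, and nodes are represented as strings.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple


class StackEntry(NamedTuple):
    """One parse-stack entry (hypothetical layout, for illustration only)."""
    state: int            # parse state entered after pushing `node`
    node: Optional[str]   # subtree pushed to reach `state`; None for the start entry


def recover(
    stack: List[StackEntry],
    token: str,
    valid_tokens: Dict[int, Set[str]],
) -> Optional[Tuple[List[StackEntry], Tuple[str, List[Optional[str]]]]]:
    """Find the most recent earlier state S in which `token` is valid.

    Returns the stack truncated to S plus an ERROR node wrapping every node
    that was popped, or None when no earlier state accepts the token.
    """
    # Start below the top entry: the top state is the one that rejected the token.
    for depth in range(len(stack) - 2, -1, -1):
        if token in valid_tokens.get(stack[depth].state, set()):
            popped = stack[depth + 1:]
            error_node = ("ERROR", [entry.node for entry in popped])
            return stack[:depth + 1], error_node
    return None


# Assumed toy table: the token "x" is only valid in state 1.
valid_tokens = {0: {"a"}, 1: {"b", "x"}, 2: {"c"}, 3: {"d"}}
stack = [StackEntry(0, None), StackEntry(1, "a"), StackEntry(2, "b"), StackEntry(3, "c")]
result = recover(stack, "x", valid_tokens)
```

In this toy run the parser would resume in state 1, a normal parse state, which matches the commit's point that the lexer is then only asked to distinguish a non-conflicting set of tokens. What happens when no earlier state accepts the token is outside the scope of this sketch.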

77 commits