tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	abf8a4f2c2	🎨	2017-03-01 22:15:26 -08:00
Max Brunsfeld	686dc0997c	Avoid introducing certain lexical conflicts during parse state merging The current pretty conservative approach is to avoid merging parse states which would cause a pair of tokens to co-exist for the first time in any parse state, where the two tokens can start with the same character and at least one of the tokens can contain a character which is part of the grammar's separators.	2017-02-27 22:54:38 -08:00
Max Brunsfeld	3c8e6f9987	Restructure parse state merging logic * Remove remnants of templatized remove_duplicate_states function * Rename recovery_tokens function to get_compatible_tokens and augment it to also compute pairs of tokens which could potentially be incompatible	2017-02-26 12:23:48 -08:00
Timothy Clem	ab00f1b0da	Add support for \W and \D negated character classes too	2017-01-31 15:03:48 -08:00
Timothy Clem	902b7f9745	Allow \S for negated whitespace regex shorthand	2017-01-31 14:45:28 -08:00
Max Brunsfeld	4131e1c16e	Return an error when external token name matches non-terminal rule	2017-01-31 11:36:51 -08:00
Max Brunsfeld	42c41c158c	Refactor logic for handling shared internal/external tokens	2016-12-21 10:49:55 -08:00
Max Brunsfeld	a09409900f	Silence missing initializer warnings in compiler unit tests	2016-12-05 16:37:06 -08:00
Max Brunsfeld	0f8e130687	Call external scanner functions when lexing	2016-12-02 22:03:48 -08:00
Max Brunsfeld	c966af0412	Start work on external tokens	2016-12-02 16:24:19 -08:00
Max Brunsfeld	6cf4ccb840	Represent rule metadata as a struct, not a map	2016-11-19 13:59:34 -08:00
Max Brunsfeld	32387400c6	Rework LR conflict resolution * Unify precedence/associativity-based resolution with the search for a whitelisted conflict * Improve conflict error messages	2016-11-18 13:50:55 -08:00
Max Brunsfeld	1118a9142a	Introduce Symbol::Index type alias	2016-11-14 10:25:26 -08:00
Max Brunsfeld	fad7294ba4	Store shift states for non-terminals directly in the main parse table	2016-11-14 08:36:06 -08:00
Max Brunsfeld	8d9c261e3a	Don't include reduce actions for nonterminal lookaheads	2016-11-10 11:33:37 -08:00
Max Brunsfeld	7bcae8f6a8	🎨 flatten_grammar	2016-11-09 20:29:21 -08:00
Timothy Clem	693c6d40dd	Move setup of mergeable_symbols to constructor, use set throughout	2016-10-18 15:18:33 -07:00
Max Brunsfeld	b76574e01c	Handle ambiguities between extra and non-extra tokens using normal GLR splitting	2016-09-06 10:22:16 -07:00
Max Brunsfeld	38c144b4a3	Refine logic for deciding when tokens need to be re-lexed * While generating the lex table, note which tokens can match the same string. A token needs to be relexed when it has possible homonyms in the current state. * Also note which tokens can match substrings of each other. A token needs to be relexed when there are viable tokens that could match longer strings in the current state and the next token has been edited. * Remove the logic for marking tokens as fragile on creation. * Store the reusability/non-reusability of symbols off of individual actions and onto the entire entry for the state & symbol.	2016-06-21 07:28:04 -07:00
Max Brunsfeld	a3679fbb1f	Distinguish separators from main tokens via a property on transitions It was incorrect to store it as a property on the lexical states themselves	2016-05-19 16:27:25 -07:00
Max Brunsfeld	59712ec492	Clean up lex table generation	2016-05-19 13:25:46 -07:00
Max Brunsfeld	22c550c9d6	Discard tokens after error detection to find the best repair * Use GLR stack-splitting to try all numbers of tokens to discard until a repair is found. * Check the validity of repairs by looking at the child trees, rather than the statically-computed 'in-progress symbols' list	2016-05-11 13:49:43 -07:00
Max Brunsfeld	5b74813a5c	Refine logic for which tokens to use in error recovery	2016-04-27 14:09:19 -07:00
Max Brunsfeld	cf19b2e58d	Make repeat rules left-recursive instead of right recursive	2016-04-18 12:40:14 -07:00
Max Brunsfeld	76d072545d	Include out-of-context states starting with non-terminals	2016-03-02 20:58:39 -08:00
Max Brunsfeld	8c01b70ce7	Don't skip tokens that are not the start of any non-terminal	2016-03-02 20:56:05 -08:00
Max Brunsfeld	dee1f697c1	Compute the set of variables that can begin with each terminal symbol	2016-02-25 21:51:52 -08:00
Max Brunsfeld	6401a065ae	Use different types for advance and accept-token actions Unlike with parse actions, lexical actions of different types never appear in the same places in the table	2016-01-22 22:24:11 -07:00
Max Brunsfeld	0f7dbea9a3	Unify test targets, use externally defined languages as fixtures	2016-01-15 11:19:24 -08:00
Max Brunsfeld	ad4089a4bf	Move anonymous tokens grammar into integration spec	2016-01-14 10:35:03 -08:00
Max Brunsfeld	4a5deda071	Add tests that compile a grammar and use its parser	2016-01-14 10:11:30 -08:00
Max Brunsfeld	d4632ab9a9	Make the compile function plain C and take a JSON grammar	2016-01-11 12:33:48 -08:00
Max Brunsfeld	36870bfced	Make Grammar a simple struct	2016-01-08 15:51:30 -08:00
Max Brunsfeld	1c6ad5f7e4	Rename ubiquitous_tokens -> extra_tokens in compiler API They were already called this in the runtime code. 'Extra' is just easier to say.	2015-12-17 15:50:50 -08:00
Max Brunsfeld	f065eb0480	Remove unused parameter to LexConflictManager	2015-12-17 15:45:47 -08:00
Max Brunsfeld	a8d2585330	Fix resolution of shift-extra vs reduce actions	2015-12-17 15:19:58 -08:00
Max Brunsfeld	351b4f4aaa	Remove unused parameters to ParseConflictManager	2015-12-17 15:19:00 -08:00
Max Brunsfeld	c495076adb	Record in parse table which actions can hide splits Suppose a parse state S has multiple actions for a terminal lookahead symbol A. Then during incremental parsing, while in state S, the parser should not reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B might prematurely discard one of the possible actions that a batch parser would have attempted in state S, upon seeing A as a lookahead.	2015-12-17 13:11:56 -08:00
Max Brunsfeld	d713054d61	Record which tokens are fragile when lexing	2015-12-10 21:05:54 -08:00
Max Brunsfeld	75f31a79a3	Treat reduce actions with different production IDs as distinct	2015-12-10 13:00:26 -08:00
Max Brunsfeld	e11515fb74	Escape backslashes and quotes in symbol name strings	2015-11-09 09:33:24 -08:00
Max Brunsfeld	d5ce268074	Fix handling of changing precedence within lexical rules. A precedence annotation wrapping a sequence of characters now only affects how tightly those characters bind to each other, not how tightly they bind to the preceding character. This bug surfaced because a generated lexer was failing to recognize a '\n' character as a token, instead treating it as ubiquitous whitespace. It made this error because, even though anonymous ubiquitous tokens have the lowest precedence, the character immediately after the '\n' was part of a normal token, which had normal precedence (0). Advancing into that following token was incorrectly prioritized above accepting the line-break token.	2015-11-08 13:36:15 -08:00
Max Brunsfeld	d7cb48aae7	Fix handling of precedence for repeat rules	2015-11-01 21:00:44 -08:00
Max Brunsfeld	d6ee28abd0	Make precedence more useful within tokens Choose accept-token actions over advance actions if their rule has a higher precedence.	2015-11-01 12:48:27 -08:00
Max Brunsfeld	998ae533da	Make completion_status() a method on LexItem	2015-10-30 16:48:37 -07:00
Max Brunsfeld	c8be143f65	🔥 get_metadata function	2015-10-30 16:22:25 -07:00
Max Brunsfeld	73b3280fbb	Include precedence calculation in LexItemSet::transitions	2015-10-30 16:07:29 -07:00
Max Brunsfeld	e9be0ff24e	Make completion_status() a method on ParseItem	2015-10-30 14:07:33 -07:00
Max Brunsfeld	4850384b78	Include precedence calculation in ParseItemSet::transitions	2015-10-30 13:54:11 -07:00
Max Brunsfeld	433f060a5b	Fix stream overloads for inspecting PrecedenceRange and ParseItem	2015-10-30 10:45:46 -07:00

278 commits