tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	87ca3cb099	Reuse nodes based on state matching, not sentential form validity. I think that state matching is the only correct strategy for incremental node reuse that is compatible with the new error recovery algorithm. It's also simpler than the sentential-form algorithm. With the compressed parse tables, state matching shouldn't be too conservative of a test.	2016-07-31 21:31:19 -07:00
Max Brunsfeld	285f2272fd	Move random string helpers into a separate file	2016-07-17 06:22:05 -07:00
Max Brunsfeld	c3a242740b	Allow lookahead to be broken down further after performing reductions	2016-07-04 12:20:23 -07:00
Max Brunsfeld	8c26d99353	Store error recovery actions in the normal parse table	2016-06-27 14:07:47 -07:00
Max Brunsfeld	08e47001f1	Silence mismatched delete warning in spec helper	2016-06-27 13:38:49 -07:00
Max Brunsfeld	9538b5b879	Don't count extra trees toward stack versions' error costs	2016-06-26 22:46:40 -07:00
Max Brunsfeld	6fd3edceae	Fix logic for inserting leading & trailing extras into root node on acceptance	2016-06-26 11:57:42 -07:00
Max Brunsfeld	9972709e43	Allow error recovery to skip non-terminal nodes after error detection	2016-06-24 10:28:05 -07:00
Max Brunsfeld	09b019c530	Fix test for invalid blank input	2016-06-23 09:24:26 -07:00
Max Brunsfeld	c6e9b32d3f	Print all the same parse log messages for both debugging methods	2016-06-22 22:36:11 -07:00
Max Brunsfeld	f425fbad18	Break down reused node on stack whenever lookahead can't be reused	2016-06-22 22:03:27 -07:00
Max Brunsfeld	38c144b4a3	Refine logic for deciding when tokens need to be re-lexed: * While generating the lex table, note which tokens can match the same string. A token needs to be re-lexed when it has possible homonyms in the current state. * Also note which tokens can match substrings of other tokens. A token needs to be re-lexed when there are viable tokens that could match longer strings in the current state and the next token has been edited. * Remove the logic for marking tokens as fragile on creation. * Move the reusability/non-reusability of symbols off of individual actions and onto the entire entry for the state & symbol.	2016-06-21 07:28:04 -07:00
Max Brunsfeld	773e50f26b	Update error recovery specs to reflect slightly different recoveries	2016-06-18 20:46:16 -07:00
Max Brunsfeld	94721c7ec0	Rewind and re-tokenize in error mode after detecting an error	2016-06-17 21:26:03 -07:00
Max Brunsfeld	70d3cde775	Remove extra leading newline from corpus spec texts	2016-06-15 10:31:34 -07:00
Max Brunsfeld	ecc7399ed3	Fix stack breakdown procedure when there are trailing extra tokens	2016-06-14 20:25:33 -07:00
Max Brunsfeld	e70547cd11	Allow recoveries that skip leading children of invisible trees. Before this, errors could only be recovered by skipping internal children.	2016-06-14 14:48:35 -07:00
Max Brunsfeld	00a0939504	Abort erroneous parse versions more eagerly	2016-06-02 14:04:48 -07:00
Max Brunsfeld	9b67b21dcd	Fix an outdated error corpus entry	2016-06-02 14:04:10 -07:00
Max Brunsfeld	ea47fdc0fe	Rework logic for when to abandon parses with errors	2016-05-29 22:36:47 -07:00
Max Brunsfeld	6535704870	Replace stack_merge_new function with two simpler functions: merge(version1, version2) and split(version)	2016-05-28 21:22:10 -07:00
Max Brunsfeld	e686478ad2	Rename stack_merge function to stack_merge_all	2016-05-28 20:24:08 -07:00
Max Brunsfeld	e1a3a1daeb	Import error corpus entries from grammar repos. Now that error recovery requires no input from the grammar author, it shouldn't be tested in the individual grammar repos.	2016-05-28 20:12:02 -07:00
Max Brunsfeld	1e353381ff	Don't create error node in lexer unless token is completely invalid. Before, any syntax error would cause the lexer to create an error leaf node. This could happen even with a valid input, if the parse stack had split and one particular version of the parse stack failed to parse. Now, an error leaf node is only created when the lexer cannot understand part of the input stream at all. When a normal syntax error occurs, the lexer just returns a token that is outside of the expected token set, and the parser handles the unexpected token.	2016-05-26 14:15:10 -07:00
Max Brunsfeld	a3679fbb1f	Distinguish separators from main tokens via a property on transitions. It was incorrect to store it as a property on the lexical states themselves.	2016-05-19 16:27:25 -07:00
Max Brunsfeld	59712ec492	Clean up lex table generation	2016-05-19 13:25:46 -07:00
Max Brunsfeld	88053cf723	In tests, don’t record allocations while printing debug graphs	2016-05-16 10:44:19 -07:00
Max Brunsfeld	22c550c9d6	Discard tokens after error detection to find the best repair: * Use GLR stack-splitting to try all numbers of tokens to discard until a repair is found. * Check the validity of repairs by looking at the child trees, rather than the statically-computed 'in-progress symbols' list.	2016-05-11 13:49:43 -07:00
Max Brunsfeld	5b74813a5c	Refine logic for which tokens to use in error recovery	2016-04-27 14:09:19 -07:00
Max Brunsfeld	fd4c33209e	Select ambiguous alternatives by minimizing error size	2016-04-24 00:54:20 -07:00
Max Brunsfeld	0d19f157ed	Adjust some spec assertions to reflect finer-grained error recoveries	2016-04-22 10:19:44 -07:00
Max Brunsfeld	cf19b2e58d	Make repeat rules left-recursive instead of right recursive	2016-04-18 12:40:14 -07:00
Max Brunsfeld	655d374d0c	Recompile test languages if parser.h changes	2016-04-18 11:17:06 -07:00
Max Brunsfeld	cad663b144	Consider multiple error repairs on the same path of the stack. This changes the API to the stack_iterate function so that you can pop from the stack without stopping iteration.	2016-04-15 21:28:00 -07:00
Max Brunsfeld	695be5bc79	Merge equivalent stacks in a separate stage of parsing: * No more automatic merging every time a state is pushed to the stack. * When popping from the stack, the current version is always preserved.	2016-04-10 14:12:24 -07:00
Max Brunsfeld	5ba40f15ad	Rename stack heads to versions	2016-04-04 12:25:57 -07:00
Max Brunsfeld	6bce6da1e6	Store `verifying` flag within parse stack	2016-03-31 12:03:21 -07:00
Max Brunsfeld	e7d3d40a59	Explicitly inform stack pop callback when the stack is exhausted. Also, pass non-extra tree count as a single value, rather than keeping track of the extra count and the total separately.	2016-03-10 11:51:55 -08:00
Max Brunsfeld	240355b04c	Make test for allocation failure handling fail more gracefully	2016-03-10 11:36:26 -08:00
Max Brunsfeld	4f726da881	Fix logic for whether to regenerate parsers in specs	2016-03-10 11:35:59 -08:00
Max Brunsfeld	2e35587161	Use new stack_pop_until function for repairing errors	2016-03-07 20:06:46 -08:00
Max Brunsfeld	4348eb89d4	Expose lower stack nodes via pop_until() function. This callback-based API allows the parser to easily visit each interior node of the stack when searching for an error repair. It also is a better abstraction over the stack's DAG implementation than having the public functions for accessing entries and their successor entries.	2016-03-07 16:09:34 -08:00
Max Brunsfeld	bc8df9f5c5	Avoid recompiling test languages when possible	2016-03-03 12:05:04 -08:00
Max Brunsfeld	c0595c21c5	Halt stack pops at all error states, not just error trees	2016-03-03 11:05:37 -08:00
Max Brunsfeld	3d516aeeec	Give StackPushResult enumerators shorter names	2016-03-03 10:20:05 -08:00
Max Brunsfeld	8a13b5d120	Rename StackPopResult -> StackSlice	2016-03-03 10:16:10 -08:00
Max Brunsfeld	aef7582a2a	Start using the forward move to recover from errors. Some unit tests passing. Corpus tests still failing.	2016-03-02 21:08:42 -08:00
Max Brunsfeld	76d072545d	Include out-of-context states starting with non-terminals	2016-03-02 20:58:39 -08:00
Max Brunsfeld	e7abfdd373	Prevent string assertion failures from creating later memory leak errors	2016-03-02 20:58:39 -08:00
Max Brunsfeld	8c01b70ce7	Don't skip tokens that are not the start of any non-terminal	2016-03-02 20:56:05 -08:00

655 commits