tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	a89322c5f1	Remove unneeded parameters from public interface of stack_iterate callback	2017-06-29 16:43:56 -07:00
Max Brunsfeld	009d6d1534	Improve heuristics for pruning parse versions based on errors * Rewrite the error cost comparison in terms of explicit, discrete conditions. * Allow merging versions that have different error costs. * Store the depth of each stack version since the last error. Use this state to prevent incorrect merging. * Sort the stack versions in order of preference and put a hard limit on the version count.	2017-06-29 15:00:20 -07:00
Max Brunsfeld	445be0736a	Clean up ts_stack_push function	2017-06-29 15:00:20 -07:00
Max Brunsfeld	0143bfdad4	Avoid use-after-free of external token states Previously, it was possible for references to external token states to outlive the trees to which those states belonged. Now, instead of storing references to external token states in the Stack and in the Lexer, we store references to the external token trees themselves, and we retain the trees to prevent use-after-free.	2017-06-27 14:54:27 -07:00
Max Brunsfeld	d57043b665	Add ability to store external token state per stack version	2017-01-04 21:22:23 -08:00
Max Brunsfeld	e7217f1bac	Clean up some methods in parser.c	2016-11-14 17:25:55 -08:00
Max Brunsfeld	535879a2bd	Represent byte, char and tree counts as 32 bit numbers The parser spends the majority of its time allocating and freeing trees and stack nodes. Also, the memory footprint of the AST is a significant concern when using tree-sitter with large files. This library is already unlikely to work very well with source files larger than 4GB, so representing rows, columns, byte lengths and child indices as unsigned 32 bit integers seems like the right choice.	2016-11-14 12:19:13 -08:00
Max Brunsfeld	c9dcb29c6f	Remove the TS prefix from some internal type/function names	2016-11-09 20:59:05 -08:00
Max Brunsfeld	4106ecda43	Remove logic for recovering from OOM	2016-11-04 09:18:38 -07:00
Max Brunsfeld	e149d94ff5	Remove generated parsers' dependency on runtime.h	2016-10-05 14:02:49 -07:00
Max Brunsfeld	e0b0e29a2b	Update parse count correctly when repairing errors & undoing reductions	2016-09-01 10:04:20 -07:00
Max Brunsfeld	7483da4184	Add push_count to stack, use it in error comparisons	2016-08-31 17:29:14 -07:00
Max Brunsfeld	0faae52132	Fix some inconsistencies in error cost calculation Signed-off-by: Nathan Sobo <nathan@github.com>	2016-08-31 10:51:59 -07:00
Max Brunsfeld	52ccebbf80	Rename error_depth -> error_count	2016-08-30 09:44:40 -07:00
Max Brunsfeld	00a0939504	Abort erroneous parse versions more eagerly	2016-06-02 14:04:48 -07:00
Max Brunsfeld	ea47fdc0fe	Rework logic for when to abandon parses with errors	2016-05-29 22:36:47 -07:00
Max Brunsfeld	6535704870	Replace stack_merge_new function with two simpler functions - merge(version1, version2) - split(version)	2016-05-28 21:22:10 -07:00
Max Brunsfeld	e686478ad2	Rename stack_merge function to stack_merge_all	2016-05-28 20:24:08 -07:00
Max Brunsfeld	1e353381ff	Don't create error node in lexer unless token is completely invalid Before, any syntax error would cause the lexer to create an error leaf node. This could happen even with a valid input, if the parse stack had split and one particular version of the parse stack failed to parse. Now, an error leaf node is only created when the lexer cannot understand part of the input stream at all. When a normal syntax error occurs, the lexer just returns a token that is outside of the expected token set, and the parser handles the unexpected token.	2016-05-26 14:15:10 -07:00
Max Brunsfeld	88053cf723	In tests, don’t record allocations while printing debug graphs	2016-05-16 10:44:19 -07:00
Max Brunsfeld	d50f6a58cc	Abort parse versions w/ worse errors when repairing an error	2016-05-16 10:33:19 -07:00
Max Brunsfeld	22c550c9d6	Discard tokens after error detection to find the best repair * Use GLR stack-splitting to try all numbers of tokens to discard until a repair is found. * Check the validity of repairs by looking at the child trees, rather than the statically-computed 'in-progress symbols' list	2016-05-11 13:49:43 -07:00
Max Brunsfeld	e99a3925e0	Merge all versions created in a given reduce operation	2016-04-24 00:55:19 -07:00
Max Brunsfeld	fd4c33209e	Select ambiguous alternatives by minimizing error size	2016-04-24 00:54:20 -07:00
Max Brunsfeld	cad663b144	Consider multiple error repairs on the same path of the stack This changes the API to the stack_iterate function so that you can pop from the stack without stopping iteration	2016-04-15 21:28:00 -07:00
Max Brunsfeld	695be5bc79	Merge equivalent stacks in a separate stage of parsing * No more automatic merging every time a state is pushed to the stack * When popping from the stack, the current version is always preserved	2016-04-10 14:12:24 -07:00
Max Brunsfeld	5ba40f15ad	Rename stack heads to versions	2016-04-04 12:25:57 -07:00
Max Brunsfeld	b1a696085a	Clean up stack pop functions	2016-04-04 11:59:10 -07:00
Max Brunsfeld	2f3e92c9be	Add function for popping all nodes from the stack	2016-04-04 11:44:45 -07:00
Max Brunsfeld	91e3609fbf	Write to file directly from stack debugging function	2016-04-02 22:18:44 -07:00
Max Brunsfeld	6bce6da1e6	Store `verifying` flag within parse stack	2016-03-31 12:03:21 -07:00
Max Brunsfeld	e7d3d40a59	Explicitly inform stack pop callback when the stack is exhausted Also, pass non-extra tree count as a single value, rather than keeping track of the extra count and the total separately.	2016-03-10 11:51:55 -08:00
Max Brunsfeld	4348eb89d4	Expose lower stack nodes via pop_until() function This callback-based API allows the parser to easily visit each interior node of the stack when searching for an error repair. It also is a better abstraction over the stack's DAG implementation than having the public functions for accessing entries and their successor entries.	2016-03-07 16:09:34 -08:00
Max Brunsfeld	c0595c21c5	Halt stack pops at all error states, not just error trees	2016-03-03 11:05:37 -08:00
Max Brunsfeld	3d516aeeec	Give StackPushResult enumerators shorter names	2016-03-03 10:20:05 -08:00
Max Brunsfeld	8a13b5d120	Rename StackPopResult -> StackSlice	2016-03-03 10:16:10 -08:00
Max Brunsfeld	5a34d74702	Clean up stack	2016-02-25 21:51:39 -08:00
Max Brunsfeld	da2ef7ad35	Store trees in the links between stack nodes, not in the nodes themselves	2016-02-23 17:35:50 -08:00
Max Brunsfeld	6dd92c3abe	Add function for rendering the stack as a DOT graph	2016-02-23 00:08:55 -08:00
Max Brunsfeld	f444a715fd	Clean up tree array assertions in stack spec	2016-02-22 09:23:25 -08:00
Max Brunsfeld	b113dc8b0f	Return a TreeArray from ts_stack_pop Since the capacity is now included in the return value, the buffer can be reused in the ts_parser__accept function. Also, it's just cleaner to use Array consistently, rather than a separate buffer and size.	2016-02-21 22:31:13 -08:00
Max Brunsfeld	3d7df851b5	Rename Vector -> Array	2016-02-17 20:41:29 -08:00
Max Brunsfeld	6fa7eca966	Make vector struct type-safe	2016-02-17 15:30:47 -08:00
Max Brunsfeld	e90a425618	Only return one result for each revealed head from ts_stack_pop	2016-02-08 12:08:15 -08:00
Max Brunsfeld	3dde0a6f39	Handle allocation failures during parsing	2016-01-19 18:08:01 -08:00
Max Brunsfeld	7fbb628c78	Remove TreeSelectionCallback struct Just make a typedef for the function type	2015-12-17 12:09:06 -08:00
Max Brunsfeld	10286f307f	Pass reference to parser in stack's tree selection callback	2015-12-08 12:21:27 -08:00
Max Brunsfeld	d2bf88d5fe	Include rows and columns in TSLength This way, we don't have to have separate 1D and 2D versions for so many values	2015-12-04 20:20:29 -08:00
Max Brunsfeld	863cabc827	Don't include trailing ubiquitous tokens as children when reducing	2015-12-02 15:31:15 -08:00
Max Brunsfeld	c88e9044d5	Make stack popping more robust	2015-11-20 00:04:21 -08:00


67 commits