tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	996ca91e70	Disallow syntax rules that match the empty string (for now)	2016-11-30 23:19:54 -08:00
Max Brunsfeld	535879a2bd	Represent byte, char and tree counts as 32 bit numbers. The parser spends the majority of its time allocating and freeing trees and stack nodes. Also, the memory footprint of the AST is a significant concern when using tree-sitter with large files. This library is already unlikely to work very well with source files larger than 4GB, so representing rows, columns, byte lengths and child indices as unsigned 32 bit integers seems like the right choice.	2016-11-14 12:19:13 -08:00
Max Brunsfeld	fad7294ba4	Store shift states for non-terminals directly in the main parse table	2016-11-14 08:36:06 -08:00
Max Brunsfeld	4106ecda43	Remove logic for recovering from OOM	2016-11-04 09:18:38 -07:00
Max Brunsfeld	e53beb66c9	Avoid anonymous nested struct to silence override-init warnings	2016-10-26 11:10:56 -07:00
Max Brunsfeld	eed54d95e1	Merge branch 'master' into changed-ranges	2016-10-16 21:10:25 -07:00
Max Brunsfeld	e149d94ff5	Remove generated parsers' dependency on runtime.h	2016-10-05 14:02:49 -07:00
Max Brunsfeld	00528e50ce	Change edit API to be byte-based	2016-09-13 13:08:52 -07:00
Max Brunsfeld	cc62fe0375	Represent Lengths in terms of Points	2016-09-09 21:11:02 -07:00
Max Brunsfeld	131bbee160	Rename parse_and_diff -> parse_and_get_changed_ranges Signed-off-by: Nathan Sobo <nathan@github.com>	2016-09-08 17:51:34 -07:00
Max Brunsfeld	fce8d57152	Start work on document_parse_and_diff API	2016-09-08 17:51:20 -07:00
Max Brunsfeld	a6a08dde31	Rename ts_node_name -> ts_node_type	2016-09-06 21:43:59 -07:00
Max Brunsfeld	38241d466b	Rename .read_fn, .seek_fn -> .read, .seek	2016-09-06 21:39:10 -07:00
Max Brunsfeld	f6da44fdbb	Add ts_node_descendant_for_byte_range	2016-09-06 21:33:19 -07:00
Max Brunsfeld	70756034f1	Allow descendant queries by both 1D and 2D coordinates	2016-09-06 21:17:26 -07:00
Max Brunsfeld	096ac2d4b6	Rename ts_document_set_debugger -> ts_document_set_logger	2016-09-06 17:40:26 -07:00
Max Brunsfeld	64a6c9db0e	Rename ts_document_make -> ts_document_new	2016-09-06 17:26:18 -07:00
Max Brunsfeld	b76574e01c	Handle ambiguities between extra and non-extra tokens using normal GLR splitting	2016-09-06 10:22:16 -07:00
Max Brunsfeld	4f0c83ba01	Move logic for lexical error handling outside of lexer functions. This way, less logic needs to be exposed in parser.h	2016-09-03 23:40:57 -07:00
Max Brunsfeld	1c52c30111	Fix unexpected EOF errors getting lost	2016-09-03 22:46:14 -07:00
Max Brunsfeld	8c26d99353	Store error recovery actions in the normal parse table	2016-06-27 14:07:47 -07:00
Max Brunsfeld	43ae8235fd	Remove the error action; a lack of actions implies an error.	2016-06-21 22:53:48 -07:00
Max Brunsfeld	6a7a5cfc3f	Remove nesting in parse action struct	2016-06-21 21:36:33 -07:00
Max Brunsfeld	38c144b4a3	Refine logic for deciding when tokens need to be re-lexed: * While generating the lex table, note which tokens can match the same string. A token needs to be relexed when it has possible homonyms in the current state. * Also note which tokens can match substrings of each other. A token needs to be relexed when there are viable tokens that could match longer strings in the current state and the next token has been edited. * Remove the logic for marking tokens as fragile on creation. * Store the reusability/non-reusability of symbols off of individual actions and onto the entire entry for the state & symbol.	2016-06-21 07:28:04 -07:00
Max Brunsfeld	45f7cee0c8	Handle extra tokens properly during error recovery	2016-06-18 20:46:25 -07:00
Max Brunsfeld	94721c7ec0	Rewind and re-tokenize in error mode after detecting an error	2016-06-17 21:26:03 -07:00
Max Brunsfeld	1e353381ff	Don't create error node in lexer unless token is completely invalid. Before, any syntax error would cause the lexer to create an error leaf node. This could happen even with a valid input, if the parse stack had split and one particular version of the parse stack failed to parse. Now, an error leaf node is only created when the lexer cannot understand part of the input stream at all. When a normal syntax error occurs, the lexer just returns a token that is outside of the expected token set, and the parser handles the unexpected token.	2016-05-26 14:15:10 -07:00
Max Brunsfeld	a3679fbb1f	Distinguish separators from main tokens via a property on transitions. It was incorrect to store it as a property on the lexical states themselves	2016-05-19 16:27:25 -07:00
Max Brunsfeld	31cc6e6f9c	Remove unused InProgressSymbolEntry typedef	2016-05-16 12:46:29 -07:00
Max Brunsfeld	22c550c9d6	Discard tokens after error detection to find the best repair: * Use GLR stack-splitting to try all numbers of tokens to discard until a repair is found. * Check the validity of repairs by looking at the child trees, rather than the statically-computed 'in-progress symbols' list	2016-05-11 13:49:43 -07:00
Max Brunsfeld	9d247e45b2	Deemphasize extra trees in stack debugging graphs	2016-05-01 15:24:50 -07:00
Max Brunsfeld	9ad1e36238	Rename out_of_context_states -> recovery_states	2016-04-27 14:14:56 -07:00
Max Brunsfeld	f63fcffe95	Fix incorrect cast in ts_language_symbol_is_in_progress	2016-04-18 11:17:07 -07:00
Max Brunsfeld	e0c24e3be6	Remove old error recovery code	2016-03-02 20:58:39 -08:00
Max Brunsfeld	c8d7c16f87	Use out-of-context states when in error parse state	2016-03-02 20:56:05 -08:00
Max Brunsfeld	9b2e775b79	Store out-of-context states in the language struct	2016-03-02 20:56:05 -08:00
Max Brunsfeld	ffcd8b5c49	Generate C code for the in-progress symbols in each parse state	2016-03-02 20:56:05 -08:00
Max Brunsfeld	abbc282950	Add a public function for printing debugging graphs	2016-02-23 11:16:50 -08:00
Max Brunsfeld	2b35890bbb	Add ts_node_symbols() function	2016-02-19 15:41:30 -08:00
Max Brunsfeld	b80a330a74	Fix assorted memory leaks in test code	2016-02-05 12:23:54 -08:00
Max Brunsfeld	3dde0a6f39	Handle allocation failures during parsing	2016-01-19 18:08:01 -08:00
Max Brunsfeld	9d0835edbf	Return non-const string from ts_node_string. The caller should free the string.	2016-01-18 10:27:23 -08:00
Max Brunsfeld	d4632ab9a9	Make the compile function plain C and take a JSON grammar	2016-01-11 12:33:48 -08:00
Max Brunsfeld	b69e19c525	Add plain C API for compiling a JSON grammar	2016-01-10 13:44:22 -08:00
Max Brunsfeld	36870bfced	Make Grammar a simple struct	2016-01-08 15:51:30 -08:00
Max Brunsfeld	4b04afac5e	Control lexer's error-mode via explicit boolean argument. Previously, the lexer would operate in error-mode (ignoring any garbage input until it found a valid token) if it was invoked in the 'error' state. Now that the error state is deduped with other lexical states, the lexer might be invoked in that state even when error-mode is not intended. This adds a third argument to `ts_lex` that explicitly sets the error-mode. This bug was unlikely to occur in any real grammars, but it caused the node-tree-sitter-compiler test suite to fail for some grammars with only one rule.	2015-12-30 09:43:12 -08:00
Max Brunsfeld	4ad1a666be	clang-format	2015-12-29 21:17:31 -08:00
Max Brunsfeld	97a281502e	Store parse table more compactly	2015-12-29 11:27:41 -08:00
Max Brunsfeld	f2e7058ad9	Support UTF16 directly. This makes the API easier to use from javascript	2015-12-28 13:53:22 -08:00
Max Brunsfeld	2bcd2e4d00	Reuse fragile tokens that came from the current lex state	2015-12-21 16:04:11 -08:00

267 commits