tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	230f89d0ff	Fix build warnings in tests	2017-08-07 12:19:10 -07:00
Max Brunsfeld	94dc703bfc	Require that grammars' start rules be visible	2017-08-04 17:07:37 -07:00
Max Brunsfeld	1dca3a0b58	Simplify parse version reordering	2017-08-04 14:51:14 -07:00
Max Brunsfeld	e5c3bf742d	Update fixture grammars	2017-08-03 16:32:39 -07:00
Max Brunsfeld	84e4114f79	Allow conflicts involving repeat rules to be whitelisted, via their parent rule	2017-08-03 15:18:29 -07:00
Max Brunsfeld	119c67dd78	Fix conflict reporting for shift/reduce conflicts w/ multiple reductions We were failing to rule out shift actions with lower precedence. Signed-off-by: Philip Turnbull <philipturnbull@github.com>	2017-08-02 15:13:30 -07:00
Max Brunsfeld	09f4796f6b	Get tests passing w/ new alias API	2017-08-01 14:35:34 -07:00
Max Brunsfeld	cb5fe80348	Rename RENAME rule to ALIAS, allow it to create anonymous nodes	2017-07-31 16:41:11 -07:00
Max Brunsfeld	cbdfd89675	Mark reductions as fragile based on their final properties We previously maintained a set of individual productions that were involved in conflicts, but that was subtly incorrect because we don't compare productions themselves when comparing parse items; we only compare the parse items' properties that could affect the final reduce actions.	2017-07-21 09:54:24 -07:00
Max Brunsfeld	7d9d8bce79	Handle inlined rules that contain other inlined rules	2017-07-20 15:29:06 -07:00
Max Brunsfeld	f33421c53e	Fix incorrect node renames in the presence of extra tokens	2017-07-18 21:24:34 -07:00
Max Brunsfeld	10d28d4b56	Merge pull request #92 from tree-sitter/utf16-oob Add test for UTF16 out-of-bound read	2017-07-18 17:24:31 -07:00
Phil Turnbull	52cec9ed39	Rework SpyInput buffer handling SpyInput uses a fixed-size buffer and explicitly zeros memory which is good for catching logic errors but defeats valgrind's memory tracking. Use a separate buffer of exactly the correct size for each request. This correctly catches the problem under valgrind: ``` ==8694== Invalid read of size 2 ==8694== at 0x54EFFB: utf16_iterate (utf16.c:10) ==8694== by 0x551126: ts_lexer__get_lookahead (lexer.c:54) ==8694== by 0x5515CD: ts_lexer_start (lexer.c:154) ==8694== by 0x54699F: parser(long,...)(long long) (parser.c:297) ==8694== by 0x54788A: parser__get_lookahead (parser.c:439) ==8694== by 0x54B2D3: parser__advance (parser.c:1150) ==8694== by 0x54C2AA: parser_parse (parser.c:1348) ==8694== by 0x53F063: ts_document_parse_with_options (document.c:136) ==8694== by 0x53EF43: ts_document_parse (document.c:107) ==8694== by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82) ==8694== by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871) ==8694== by 0x40F8C5: std::function<void ()>::operator()() const (functional:2267) ==8694== Address 0x5d08be0 is 0 bytes inside a block of size 1 alloc'd ==8694== at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==8694== by 0x507C3E: SpyInput::read(void, unsigned int) (spy_input.cc:66) ==8694== by 0x55103D: ts_lexer__get_chunk (lexer.c:29) ==8694== by 0x5515B6: ts_lexer_start (lexer.c:152) ==8694== by 0x54699F: parser(long,...)(long long) (parser.c:297) ==8694== by 0x54788A: parser__get_lookahead (parser.c:439) ==8694== by 0x54B2D3: parser__advance (parser.c:1150) ==8694== by 0x54C2AA: parser_parse (parser.c:1348) ==8694== by 0x53F063: ts_document_parse_with_options (document.c:136) ==8694== by 0x53EF43: ts_document_parse (document.c:107) ==8694== by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82) ==8694== by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871) ```	2017-07-18 12:16:37 -07:00
Max Brunsfeld	afb499bf2e	Handle rename symbols in ts_language APIs	2017-07-18 12:01:52 -07:00
Max Brunsfeld	de17c92462	Fix setup in stack test	2017-07-18 08:21:35 -07:00
Max Brunsfeld	085d96d89d	Add bash examples to benchmarks	2017-07-17 17:50:04 -07:00
Max Brunsfeld	45c40c8742	Update test grammars to use new serialization API	2017-07-17 17:46:46 -07:00
Max Brunsfeld	9a04231ab1	Remove length restriction in external scanner serialization API	2017-07-17 17:12:36 -07:00
Phil Turnbull	e7662c2213	Handle out-of-bound read in utf16_iterate Also simplify the test so we call `utf16_iterate` directly. Calling `utf16_iterate` via `SpyInput` and `ts_document_parse` doesn't seem to reliably trigger the problem using valgrind. valgrind also doesn't detect the problem if we use a string literal like: `utf16_iterate("", 1, &code_point);`	2017-07-17 13:57:12 -07:00
Phil Turnbull	035abc1e15	Add test for UTF16 out-of-bound read utf16_iterate does not check that 'length' is a multiple of two which leads to an out-of-bound read: ==105293== Conditional jump or move depends on uninitialised value(s) ==105293== at 0x54F014: utf16_iterate (utf16.c:7) ==105293== by 0x539251: string_iterate(TSInputEncoding, unsigned char const, unsigned long, int) (encoding_helpers.cc:15) ==105293== by 0x53939D: string_byte_for_character(TSInputEncoding, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long) (encoding_helpers.cc:43) ==105293== by 0x507BAD: SpyInput::read(void, unsigned int) (spy_input.cc:47) ==105293== by 0x551049: ts_lexer__get_chunk (lexer.c:29) ==105293== by 0x5515C2: ts_lexer_start (lexer.c:152) ==105293== by 0x5469AB: parser(long,...)(long long) (parser.c:297) ==105293== by 0x547896: parser__get_lookahead (parser.c:439) ==105293== by 0x54B2DF: parser__advance (parser.c:1150) ==105293== by 0x54C2B6: parser_parse (parser.c:1348) ==105293== by 0x53F06F: ts_document_parse_with_options (document.c:136) ==105293== by 0x53EF4F: ts_document_parse (document.c:107)	2017-07-17 12:34:39 -07:00
Max Brunsfeld	34279257f9	Merge pull request #91 from tree-sitter/libFuzzer Add support for fuzzing with libFuzzer	2017-07-17 11:43:01 -07:00
Phil Turnbull	798ef5e4dc	Add libFuzzer support This adds support for fuzzing tree-sitter grammars with libFuzzer. This currently only works on Linux because of linking issues on macOS. Briefly, the AddressSanitizer library is dynamically linked into the fuzzer binary and cannot be found at runtime if built with a compiler that wasn't provided by Xcode(?). The runtime library is statically linked on Linux so this isn't a problem.	2017-07-14 13:50:41 -07:00
Max Brunsfeld	a22386e408	Fix compiler warnings in flatten_grammar_test	2017-07-14 10:26:34 -07:00
Max Brunsfeld	4b40a1ed6c	Support anonymous tokens inside of RENAME rules	2017-07-14 10:19:58 -07:00
Max Brunsfeld	b3a72954ff	Introduce RENAME rule type	2017-07-13 17:17:22 -07:00
Max Brunsfeld	7293e6f0cc	Fix compile warnings	2017-07-12 22:08:36 -07:00
Max Brunsfeld	a3006bc2b5	Represent LookaheadSet using vectors of bool	2017-07-12 16:02:01 -07:00
Max Brunsfeld	e4f57d6fee	Test more cases in fixture grammar with inline rules	2017-07-12 10:12:42 -07:00
Max Brunsfeld	5c8f7c035e	Add stream operator for ParseItemSet	2017-07-12 09:42:56 -07:00
Max Brunsfeld	65bf1389e1	Add a way to automatically inline rules	2017-07-11 23:13:44 -07:00
Max Brunsfeld	26a25278cd	When comparing parse items, ignore consumed part of their productions This speeds up parser generation by increasing the likelihood that we'll recognize parse item sets as equivalent in advance, rather than having to merge their states after the fact.	2017-07-11 17:30:32 -07:00
Max Brunsfeld	59236d2ed1	Avoid redundant character comparisons in generated lex function	2017-07-10 14:09:31 -07:00
Max Brunsfeld	1586d70cbe	Compute conflicting tokens more precisely While generating the parse table, keep track of which tokens can follow one another. Then use this information to evaluate token conflicts more precisely. This will result in a smaller parse table than the previous, overly-conservative approach.	2017-07-07 17:54:24 -07:00
Max Brunsfeld	a98abde529	Provide all preceding symbols as context when reporting conflicts	2017-07-07 14:52:56 -07:00
Max Brunsfeld	d8e9d04fe7	Add PREC_DYNAMIC rule for resolving runtime ambiguities	2017-07-06 15:24:45 -07:00
Max Brunsfeld	cb652239f6	Add missing semicolons in flatten_grammar test	2017-07-06 12:48:50 -07:00
Max Brunsfeld	96068bbacb	Represent all speeds as size_t in benchmarks	2017-07-06 12:24:41 -07:00
Max Brunsfeld	8c005cddc6	Print average and worst speeds in benchmark command	2017-07-06 10:12:22 -07:00
Max Brunsfeld	8f028ebf68	Avoid deep tree comparison when both trees have errors	2017-07-05 17:33:35 -07:00
Max Brunsfeld	c53f9bcbd9	Build benchmarks in Test mode for now	2017-07-05 17:27:50 -07:00
Max Brunsfeld	17bc3dfaf7	Add a benchmark command This command measures the speed of parsing each grammar's examples. It also uses each grammar to parse all of the other grammars' examples in order to measure error recovery performance with fairly large files.	2017-07-05 14:14:38 -07:00
Max Brunsfeld	298228d8de	Clean up test_grammars file	2017-07-05 11:39:33 -07:00
Max Brunsfeld	d322f0b6a7	🎨	2017-07-04 21:59:54 -07:00
Max Brunsfeld	f93f78ef2d	Remove version-pruning criteria based on pushed node count	2017-07-02 23:42:23 -07:00
Max Brunsfeld	fcffd4b732	Add test for an example found during fuzzing	2017-06-30 21:55:50 -07:00
Max Brunsfeld	eccb3893eb	Prune unneeded stack versions based on a depth criterion	2017-06-30 17:49:09 -07:00
Max Brunsfeld	a89322c5f1	Remove unneeded parameters from public interface of stack_iterate callback	2017-06-29 16:43:56 -07:00
Max Brunsfeld	009d6d1534	Improve heuristics for pruning parse versions based on errors * Rewrite the error cost comparison in terms of explicit, discrete conditions. * Allow merging versions that have different error costs. * Store the depth of each stack version since the last error. Use this state to prevent incorrect merging. * Sort the stack versions in order of preference and put a hard limit on the version count.	2017-06-29 15:00:20 -07:00
Max Brunsfeld	66be393b78	Stack - consider empty external token state identical to NULL	2017-06-29 15:00:20 -07:00
Max Brunsfeld	0143bfdad4	Avoid use-after-free of external token states Previously, it was possible for references to external token states to outlive the trees to which those states belonged. Now, instead of storing references to external token states in the Stack and in the Lexer, we store references to the external token trees themselves, and we retain the trees to prevent use-after-free.	2017-06-27 14:54:27 -07:00

85 commits