6075 commits

Author SHA1 Message Date
Phil Turnbull
e7662c2213 Handle out-of-bound read in utf16_iterate
Also simplify the test so we call `utf16_iterate` directly. Calling
`utf16_iterate` via `SpyInput` and `ts_document_parse` doesn't seem to reliably
trigger the problem using valgrind.

valgrind also doesn't detect the problem if we use a string literal like:
  `utf16_iterate("", 1, &code_point);`
2017-07-17 13:57:12 -07:00
Phil Turnbull
035abc1e15 Add test for UTF16 out-of-bound read
utf16_iterate does not check that 'length' is a multiple of two, which leads to
an out-of-bound read:

==105293== Conditional jump or move depends on uninitialised value(s)
==105293==    at 0x54F014: utf16_iterate (utf16.c:7)
==105293==    by 0x539251: string_iterate(TSInputEncoding, unsigned char const*, unsigned long, int*) (encoding_helpers.cc:15)
==105293==    by 0x53939D: string_byte_for_character(TSInputEncoding, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long) (encoding_helpers.cc:43)
==105293==    by 0x507BAD: SpyInput::read(void*, unsigned int*) (spy_input.cc:47)
==105293==    by 0x551049: ts_lexer__get_chunk (lexer.c:29)
==105293==    by 0x5515C2: ts_lexer_start (lexer.c:152)
==105293==    by 0x5469AB: parser(long,...)(long long) (parser.c:297)
==105293==    by 0x547896: parser__get_lookahead (parser.c:439)
==105293==    by 0x54B2DF: parser__advance (parser.c:1150)
==105293==    by 0x54C2B6: parser_parse (parser.c:1348)
==105293==    by 0x53F06F: ts_document_parse_with_options (document.c:136)
==105293==    by 0x53EF4F: ts_document_parse (document.c:107)
2017-07-17 12:34:39 -07:00
Max Brunsfeld
34279257f9 Merge pull request #91 from tree-sitter/libFuzzer
Add support for fuzzing with libFuzzer
2017-07-17 11:43:01 -07:00
Max Brunsfeld
66dc12587a Call the external scanner whenever an external token is valid
For some reason, there was previously some extra logic that prevented
the external scanner from being invoked if the only valid external
token also had an internal definition.

It's surprising to not call the external scanner if an external
token is valid.
2017-07-17 10:28:59 -07:00
Phil Turnbull
153c2033df Update list of test grammars 2017-07-14 13:50:42 -07:00
Phil Turnbull
798ef5e4dc Add libFuzzer support
This adds support for fuzzing tree-sitter grammars with libFuzzer. This
currently only works on Linux because of linking issues on macOS. Briefly, the
AddressSanitizer library is dynamically linked into the fuzzer binary and
cannot be found at runtime if built with a compiler that wasn't provided by
Xcode(?). The runtime library is statically linked on Linux so this isn't a
problem.
2017-07-14 13:50:41 -07:00
Max Brunsfeld
1a195d44bb Whoops, dynamic precedence needs a sign 2017-07-14 11:06:16 -07:00
Max Brunsfeld
9c9311ccd7 Merge pull request #90 from tree-sitter/rename-rules
Add an API for renaming nodes based on their context
2017-07-14 10:51:22 -07:00
Max Brunsfeld
c21d3653e8 Add inline property to grammar JSON schema 2017-07-14 10:46:33 -07:00
Max Brunsfeld
99885788bc 🎨 2017-07-14 10:41:09 -07:00
Max Brunsfeld
a22386e408 Fix compiler warnings in flatten_grammar_test 2017-07-14 10:26:34 -07:00
Max Brunsfeld
4b40a1ed6c Support anonymous tokens inside of RENAME rules 2017-07-14 10:19:58 -07:00
Max Brunsfeld
b3a72954ff Introduce RENAME rule type 2017-07-13 17:17:22 -07:00
Max Brunsfeld
0b94e9d814 Don't include preceding production steps in ParseItem hash 2017-07-13 13:42:28 -07:00
Max Brunsfeld
561821d011 Remove precedence and associativity methods from ParseAction 2017-07-13 13:41:56 -07:00
Max Brunsfeld
d646889922 Simplify flatten_rule function 2017-07-13 09:59:23 -07:00
Max Brunsfeld
7293e6f0cc Fix compile warnings 2017-07-12 22:08:36 -07:00
Max Brunsfeld
62c577af33 Remove unnecessary using statements 2017-07-12 21:41:37 -07:00
Max Brunsfeld
2659b15542 Only build master and PRs on travis 2017-07-12 21:40:57 -07:00
Max Brunsfeld
8f30b259f1 Merge pull request #89 from tree-sitter/inline-rules
Add a way to automatically inline rules
2017-07-12 17:16:47 -07:00
Max Brunsfeld
a3006bc2b5 Represent LookaheadSet using vectors of bool 2017-07-12 16:02:01 -07:00
Max Brunsfeld
e4f57d6fee Test more cases in fixture grammar with inline rules 2017-07-12 10:12:42 -07:00
Max Brunsfeld
5c8f7c035e Add stream operator for ParseItemSet 2017-07-12 09:42:56 -07:00
Max Brunsfeld
65bf1389e1 Add a way to automatically inline rules 2017-07-11 23:13:44 -07:00
Max Brunsfeld
26a25278cd When comparing parse items, ignore consumed part of their productions
This speeds up parser generation by increasing the likelihood that we'll recognize
parse item sets as equivalent in advance, rather than having to merge their states
after the fact.
2017-07-11 17:30:32 -07:00
Max Brunsfeld
a199b217f3 Optimize ParseTableBuilder for non-terminals w/ many productions 2017-07-11 12:54:29 -07:00
Max Brunsfeld
68c3ba1b8b 🎨 merge_parse_state 2017-07-10 16:46:11 -07:00
Max Brunsfeld
43d347c225 Merge pull request #87 from tree-sitter/dynamic-precedence
Introduce rule for resolving runtime ambiguities
2017-07-10 16:43:58 -07:00
Max Brunsfeld
5bd5b4bb05 Replace <cctype> -> <cwctype> 2017-07-10 14:35:14 -07:00
Max Brunsfeld
bf4b8bf55b Use my fork of crypto-algorithms 2017-07-10 14:29:14 -07:00
Max Brunsfeld
59236d2ed1 Avoid redundant character comparisons in generated lex function 2017-07-10 14:09:31 -07:00
Max Brunsfeld
2755b07222 Don't store unfinished item signature on ParseStates 2017-07-10 10:47:38 -07:00
Max Brunsfeld
1586d70cbe Compute conflicting tokens more precisely
While generating the parse table, keep track of which tokens can follow one another.
Then use this information to evaluate token conflicts more precisely. This will
result in a smaller parse table than the previous, overly-conservative approach.
2017-07-07 17:54:24 -07:00
Max Brunsfeld
a98abde529 Provide all preceding symbols as context when reporting conflicts 2017-07-07 14:52:56 -07:00
Max Brunsfeld
c91ceaaa8d 🎨 build_parse_table 2017-07-07 14:52:45 -07:00
Max Brunsfeld
0de93b3bf2 Allow negative dynamic precedences 2017-07-06 22:21:59 -07:00
Max Brunsfeld
107feb7960 Bump the language version number after adding dynamic precedences 2017-07-06 15:58:29 -07:00
Max Brunsfeld
08bb365f6c Allow PREC_DYNAMIC in JSON schema 2017-07-06 15:51:03 -07:00
Max Brunsfeld
d8e9d04fe7 Add PREC_DYNAMIC rule for resolving runtime ambiguities 2017-07-06 15:24:45 -07:00
Max Brunsfeld
cb652239f6 Add missing semicolons in flatten_grammar test 2017-07-06 12:48:50 -07:00
Max Brunsfeld
31f827945a Merge pull request #86 from tree-sitter/benchmarks
Add a benchmark command
2017-07-06 12:38:00 -07:00
Max Brunsfeld
21bc50377e Run make with the right target when building benchmarks on CI 2017-07-06 12:36:57 -07:00
Max Brunsfeld
96068bbacb Represent all speeds as size_t in benchmarks 2017-07-06 12:24:41 -07:00
Max Brunsfeld
2b73a30fba Build benchmarks in release mode 2017-07-06 11:49:32 -07:00
Max Brunsfeld
a64db98218 Rename lib.sh -> scan-build.sh 2017-07-06 10:32:41 -07:00
Max Brunsfeld
78333b70c0 Build benchmarks with scan-build on CI 2017-07-06 10:22:14 -07:00
Max Brunsfeld
8c005cddc6 Print average and worst speeds in benchmark command 2017-07-06 10:12:22 -07:00
Max Brunsfeld
8f028ebf68 Avoid deep tree comparison when both trees have errors 2017-07-05 17:33:35 -07:00
Max Brunsfeld
c53f9bcbd9 Build benchmarks in Test mode for now 2017-07-05 17:27:50 -07:00
Max Brunsfeld
782bf48772 Don't do skip_preceding_subtrees recovery when there are lots of versions 2017-07-05 15:34:19 -07:00