Commit graph

37 commits

Author SHA1 Message Date
Max Brunsfeld
f6325746aa Provide symbol metadata with dummy language in stack test 2017-08-08 17:47:24 -07:00
Max Brunsfeld
cc7277fd7d Avoid using IsNull bandit assertion 2017-08-08 12:52:35 -07:00
Max Brunsfeld
94dc703bfc Require that grammars' start rules be visible 2017-08-04 17:07:37 -07:00
Max Brunsfeld
e5c3bf742d Update fixture grammars 2017-08-03 16:32:39 -07:00
Max Brunsfeld
09f4796f6b Get tests passing w/ new alias API 2017-08-01 14:35:34 -07:00
Max Brunsfeld
cb5fe80348 Rename RENAME rule to ALIAS, allow it to create anonymous nodes 2017-07-31 16:41:11 -07:00
Max Brunsfeld
cbdfd89675 Mark reductions as fragile based on their final properties
We previously maintained a set of individual productions that were
involved in conflicts, but that was subtly incorrect because
we don't compare productions themselves when comparing parse items;
we only compare the parse items' properties that could affect the
final reduce actions.
2017-07-21 09:54:24 -07:00
Max Brunsfeld
f33421c53e Fix incorrect node renames in the presence of extra tokens 2017-07-18 21:24:34 -07:00
Max Brunsfeld
10d28d4b56 Merge pull request #92 from tree-sitter/utf16-oob
Add test for UTF16 out-of-bound read
2017-07-18 17:24:31 -07:00
Phil Turnbull
52cec9ed39 Rework SpyInput buffer handling
SpyInput uses a fixed-size buffer and explicitly zeros memory which is good for
catching logic errors but defeats valgrind's memory tracking. Use a separate
buffer of exactly the correct size for each request. This correctly catches the
problem under valgrind:

```
==8694== Invalid read of size 2
==8694==    at 0x54EFFB: utf16_iterate (utf16.c:10)
==8694==    by 0x551126: ts_lexer__get_lookahead (lexer.c:54)
==8694==    by 0x5515CD: ts_lexer_start (lexer.c:154)
==8694==    by 0x54699F: parser(long,...)(long long) (parser.c:297)
==8694==    by 0x54788A: parser__get_lookahead (parser.c:439)
==8694==    by 0x54B2D3: parser__advance (parser.c:1150)
==8694==    by 0x54C2AA: parser_parse (parser.c:1348)
==8694==    by 0x53F063: ts_document_parse_with_options (document.c:136)
==8694==    by 0x53EF43: ts_document_parse (document.c:107)
==8694==    by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82)
==8694==    by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871)
==8694==    by 0x40F8C5: std::function<void ()>::operator()() const (functional:2267)
==8694==  Address 0x5d08be0 is 0 bytes inside a block of size 1 alloc'd
==8694==    at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8694==    by 0x507C3E: SpyInput::read(void*, unsigned int*) (spy_input.cc:66)
==8694==    by 0x55103D: ts_lexer__get_chunk (lexer.c:29)
==8694==    by 0x5515B6: ts_lexer_start (lexer.c:152)
==8694==    by 0x54699F: parser(long,...)(long long) (parser.c:297)
==8694==    by 0x54788A: parser__get_lookahead (parser.c:439)
==8694==    by 0x54B2D3: parser__advance (parser.c:1150)
==8694==    by 0x54C2AA: parser_parse (parser.c:1348)
==8694==    by 0x53F063: ts_document_parse_with_options (document.c:136)
==8694==    by 0x53EF43: ts_document_parse (document.c:107)
==8694==    by 0x4AED11: {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}::operator()() const (document_test.cc:82)
==8694==    by 0x4B56B6: std::_Function_handler<void (), {lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#4}::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) (functional:1871)
```
2017-07-18 12:16:37 -07:00
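
The exact-size buffer idea from the SpyInput commit above can be sketched in C. This is a minimal illustration, not the actual SpyInput code (which is C++); `sketch_read_chunk` and all of its parameters are hypothetical names. Each request gets a heap block of exactly the number of bytes returned, so a read past the end of the chunk leaves the allocation and can be reported by a memory checker such as valgrind.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not SpyInput's real interface): copy the next chunk
 * of `content` into a freshly allocated buffer of exactly `*bytes_read`
 * bytes. The caller frees the result. Returns NULL when nothing is left
 * to read or when allocation fails. */
char *sketch_read_chunk(const char *content, size_t content_length,
                        size_t offset, size_t max_chunk_size,
                        size_t *bytes_read) {
  assert(content != NULL && bytes_read != NULL);
  *bytes_read = 0;
  if (offset >= content_length || max_chunk_size == 0) return NULL;

  size_t remaining = content_length - offset;
  size_t size = remaining < max_chunk_size ? remaining : max_chunk_size;

  /* Exact-size allocation: any read past `size` bytes leaves the heap
   * block, which a memory checker can report. */
  char *chunk = malloc(size);
  if (chunk == NULL) return NULL;
  memcpy(chunk, content + offset, size);
  *bytes_read = size;
  return chunk;
}
```

A shared fixed-size buffer would stay addressable beyond the bytes actually written, which is why an overread inside it goes unnoticed.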
Max Brunsfeld
afb499bf2e Handle rename symbols in ts_language APIs 2017-07-18 12:01:52 -07:00
Max Brunsfeld
de17c92462 Fix setup in stack test 2017-07-18 08:21:35 -07:00
Max Brunsfeld
9a04231ab1 Remove length restriction in external scanner serialization API 2017-07-17 17:12:36 -07:00
Phil Turnbull
e7662c2213 Handle out-of-bound read in utf16_iterate
Also simplify the test so we call `utf16_iterate` directly. Calling
`utf16_iterate` via `SpyInput` and `ts_document_parse` doesn't seem to reliably
trigger the problem using valgrind.

valgrind also doesn't detect the problem if we use a string literal like:
  `utf16_iterate("", 1, &code_point);`
2017-07-17 13:57:12 -07:00
Phil Turnbull
035abc1e15 Add test for UTF16 out-of-bound read
utf16_iterate does not check that 'length' is a multiple of two which leads to
an out-of-bound read:

==105293== Conditional jump or move depends on uninitialised value(s)
==105293==    at 0x54F014: utf16_iterate (utf16.c:7)
==105293==    by 0x539251: string_iterate(TSInputEncoding, unsigned char const*, unsigned long, int*) (encoding_helpers.cc:15)
==105293==    by 0x53939D: string_byte_for_character(TSInputEncoding, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long) (encoding_helpers.cc:43)
==105293==    by 0x507BAD: SpyInput::read(void*, unsigned int*) (spy_input.cc:47)
==105293==    by 0x551049: ts_lexer__get_chunk (lexer.c:29)
==105293==    by 0x5515C2: ts_lexer_start (lexer.c:152)
==105293==    by 0x5469AB: parser(long,...)(long long) (parser.c:297)
==105293==    by 0x547896: parser__get_lookahead (parser.c:439)
==105293==    by 0x54B2DF: parser__advance (parser.c:1150)
==105293==    by 0x54C2B6: parser_parse (parser.c:1348)
==105293==    by 0x53F06F: ts_document_parse_with_options (document.c:136)
==105293==    by 0x53EF4F: ts_document_parse (document.c:107)
2017-07-17 12:34:39 -07:00
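
The commit above describes a decoder that reads a 16-bit unit without checking that two bytes are available. A minimal sketch of a length-checked decoder follows. It is not tree-sitter's `utf16_iterate`: the name `sketch_utf16_decode`, its signature, the little-endian byte order, and the handling of unpaired surrogates are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: reads one code point from little-endian UTF-16
 * bytes. Returns the number of bytes consumed (2 or 4), or 0 when fewer
 * than 2 bytes remain, in which case *code_point is set to -1. */
size_t sketch_utf16_decode(const uint8_t *bytes, size_t length,
                           int32_t *code_point) {
  assert(code_point != NULL);

  /* Guard: an odd trailing byte must not be read as half of a unit. */
  if (length < 2) {
    *code_point = -1;
    return 0;
  }

  uint32_t first = (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8);

  /* A high surrogate needs a second complete unit to form a pair. */
  if (first >= 0xD800 && first <= 0xDBFF && length >= 4) {
    uint32_t second = (uint32_t)bytes[2] | ((uint32_t)bytes[3] << 8);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      *code_point =
          (int32_t)(0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00));
      return 4;
    }
  }

  /* BMP character, or an unpaired surrogate passed through unchanged. */
  *code_point = (int32_t)first;
  return 2;
}
```

Every byte access is preceded by a check against `length`, so a buffer holding a single byte is rejected instead of being read two bytes at a time.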
Max Brunsfeld
4b40a1ed6c Support anonymous tokens inside of RENAME rules 2017-07-14 10:19:58 -07:00
Max Brunsfeld
8f028ebf68 Avoid deep tree comparison when both trees have errors 2017-07-05 17:33:35 -07:00
Max Brunsfeld
d322f0b6a7 🎨 2017-07-04 21:59:54 -07:00
Max Brunsfeld
a89322c5f1 Remove unneeded parameters from public interface of stack_iterate callback 2017-06-29 16:43:56 -07:00
Max Brunsfeld
66be393b78 Stack - consider empty external token state identical to NULL 2017-06-29 15:00:20 -07:00
Max Brunsfeld
0143bfdad4 Avoid use-after-free of external token states
Previously, it was possible for references to external token states to
outlive the trees to which those states belonged.

Now, instead of storing references to external token states in the Stack
and in the Lexer, we store references to the external token trees
themselves, and we retain the trees to prevent use-after-free.
2017-06-27 14:54:27 -07:00
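
The approach described in the commit above, holding a retained reference to the tree rather than a bare pointer into its state, can be sketched with plain reference counting. All names here (`SketchTree`, `SketchLexer`, and the functions) are hypothetical, and the real tree and lexer structures are certainly more involved; this only illustrates why retaining the owner prevents the use-after-free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree that owns some external token state. */
typedef struct {
  unsigned ref_count;
  int external_state;
} SketchTree;

/* Hypothetical holder that must not outlive the state it refers to. */
typedef struct {
  SketchTree *last_external_token;
} SketchLexer;

SketchTree *sketch_tree_new(int external_state) {
  SketchTree *tree = malloc(sizeof(SketchTree));
  if (tree == NULL) return NULL;
  tree->ref_count = 1;
  tree->external_state = external_state;
  return tree;
}

void sketch_tree_retain(SketchTree *tree) {
  assert(tree->ref_count > 0);
  tree->ref_count++;
}

/* Returns 1 if this call freed the tree, 0 otherwise. */
int sketch_tree_release(SketchTree *tree) {
  assert(tree->ref_count > 0);
  if (--tree->ref_count == 0) {
    free(tree);
    return 1;
  }
  return 0;
}

/* Store the tree itself and retain it, so the state stays alive for as
 * long as the lexer refers to it. Retain before release so that setting
 * the same tree twice is safe. */
void sketch_lexer_set_token(SketchLexer *lexer, SketchTree *tree) {
  if (tree != NULL) sketch_tree_retain(tree);
  if (lexer->last_external_token != NULL) {
    sketch_tree_release(lexer->last_external_token);
  }
  lexer->last_external_token = tree;
}
```

With this shape, the original owner can release its reference while the lexer still reads `external_state` safely; the memory is freed only when the last holder lets go.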
Max Brunsfeld
f62ee5a0f3 Fix OOB reads at ends of chunks
Signed-off-by: Philip Turnbull <philipturnbull@github.com>
2017-06-23 12:09:16 -07:00
Max Brunsfeld
513edec7c1 Merge pull request #77 from philipturnbull/scan-build-fixes
Fix errors found by scan-build
2017-06-20 10:15:20 -07:00
Max Brunsfeld
c66fddd3aa Add TSInput option to measure columns in bytes not characters 2017-06-15 16:35:34 -07:00
Max Brunsfeld
b862db766e Merge remote-tracking branch 'origin/master' into update-fixture-grammars 2017-06-14 17:11:44 -07:00
Phil Turnbull
18f261ad51 Initialise all fields of TSParseOptions in tests
This should prevent any confusing failures in the unit tests:

test/runtime/document_test.cc:381:7: warning: Passed-by-value struct argument contains uninitialized data (e.g., field: 'changed_range_count')
      ts_document_parse_with_options(document, options);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test/runtime/document_test.cc:408:7: warning: Passed-by-value struct argument contains uninitialized data (e.g., field: 'changed_range_count')
      ts_document_parse_with_options(document, options);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2017-06-14 11:12:06 -04:00
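
The warning quoted above is about a struct passed by value with a field left unset. A minimal sketch of the usual remedy, zero-initialising the whole struct before setting individual fields, is shown below. `SketchParseOptions` is a hypothetical stand-in: only the field name `changed_range_count` comes from the warning text, and the other fields are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical options struct; not the real TSParseOptions layout. */
typedef struct {
  void **changed_ranges;
  uint32_t changed_range_count;
  bool halt_on_error;
} SketchParseOptions;

/* Returns options with every field set to zero, NULL, or false. */
SketchParseOptions sketch_default_options(void) {
  SketchParseOptions options = {0};
  return options;
}

/* Hypothetical consumer that receives the struct by value, as in the
 * warning. It only checks that the fields are mutually consistent. */
bool sketch_options_are_consistent(SketchParseOptions options) {
  return options.changed_range_count == 0 || options.changed_ranges != NULL;
}
```

Starting from `sketch_default_options()` and then assigning only the fields a test cares about means no field reaches the callee uninitialised.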
Max Brunsfeld
74f5ceddf7 Fix parsing of valid code with halt_on_error flag set
Signed-off-by: Tim Clem <timothy.clem@gmail.com>
2017-05-01 14:25:25 -07:00
Max Brunsfeld
a98d449d88 Add an option to immediately halt on syntax error 2017-05-01 13:50:49 -07:00
Max Brunsfeld
03a555a86e Finish test for invalid UTF8 handling
Signed-off-by: Tim Clem <timothy.clem@gmail.com>
2017-04-27 14:48:16 -07:00
Timothy Clem
37f2a4745f Test demonstrating non-UTF8 input failure 2017-04-27 14:46:36 -07:00
Max Brunsfeld
a15e974150 Make clearer assertions about SpyInput's read strings 2017-03-21 12:14:04 -07:00
Max Brunsfeld
ca943f09a4 Update expected trees in error recovery test 2017-03-21 11:41:01 -07:00
Max Brunsfeld
f032da198e Finish test for invalid UTF8 handling
Signed-off-by: Tim Clem <timothy.clem@gmail.com>
2017-03-21 11:05:32 -07:00
Timothy Clem
7092d4522a Test demonstrating non-UTF8 input failure 2017-03-21 09:58:35 -07:00
Max Brunsfeld
42b05b4b5e Add simple unit test for invalidating trees preceding an edit due to lookahead 2017-03-13 17:34:31 -07:00
Max Brunsfeld
d222dbb9fd Allow lexer to accept tokens that ended at previous positions
* Track lookahead in each tree
* Add 'mark_end' API that external scanners can use
2017-03-13 17:06:52 -07:00
Max Brunsfeld
6dc0ff359d Rename spec -> test
'Test' is a lot more straightforward of a name.
2017-03-09 20:40:01 -08:00