205 commits

Author SHA1 Message Date
Max Brunsfeld
ae0a7fc97d Add logging debugger for debugging failing tests 2014-10-17 23:05:08 -07:00
Max Brunsfeld
41a067fef9 Fix build warnings in document spec 2014-10-17 21:27:49 -07:00
Max Brunsfeld
8cf800ef5d Unify debugging API for parsing and lexing 2014-10-17 17:52:54 -07:00
Max Brunsfeld
22ee68e1a9 Make node for each var assignment in JS grammar 2014-10-15 15:04:57 -07:00
Max Brunsfeld
d3137c6ac6 Organize parser spec 2014-10-14 23:23:12 -07:00
Max Brunsfeld
d33b074c30 Don't call input::seek_fn unnecessarily 2014-10-14 22:56:42 -07:00
Max Brunsfeld
b5d022a70c Fix missing field warnings for debugger structs 2014-10-14 22:50:24 -07:00
Max Brunsfeld
c594208ab8 Allow callbacks to be specified for debug output 2014-10-13 01:02:18 -07:00
Max Brunsfeld
fb38140317 Discard portion of right subtree that is within the edited region 2014-10-12 11:51:12 -07:00
Max Brunsfeld
f460b921e2 Fix off-by-one error in storing reusable right-subtree 2014-10-10 12:10:23 -07:00
Max Brunsfeld
4dcc712a8c Start work on re-using right side of parse tree 2014-10-09 19:58:15 -07:00
Max Brunsfeld
af7f57a80e Fix sizing of error nodes after edits 2014-10-05 16:56:50 -07:00
Max Brunsfeld
e5ea4efb0b Use stdbool.h 2014-10-03 16:06:08 -07:00
Max Brunsfeld
808b003f1a Read unicode characters correctly in Lexer advance 2014-10-03 15:44:49 -07:00
Max Brunsfeld
1fa3bf0f07 In SpyReader::read, always return complete unicode characters 2014-10-03 14:30:19 -07:00
Max Brunsfeld
17f43e5e0c Clean up SpyReader 2014-10-03 14:21:39 -07:00
Max Brunsfeld
a69dfa08f3 Add spec for inserting text w/ unicode characters 2014-10-02 11:54:00 -07:00
Max Brunsfeld
8bee9d8fb9 Fix typo in parser spec descriptions 2014-10-02 11:52:58 -07:00
Max Brunsfeld
0f524121f1 Add SpyReader methods for inserting/removing by char index 2014-10-02 11:43:22 -07:00
Max Brunsfeld
5f313896c3 Make ::input a method on SpyReader, not a field 2014-09-30 14:57:57 -07:00
Max Brunsfeld
8d7d9af661 Remove unnecessary helper function in parser spec 2014-09-29 10:51:12 -07:00
Max Brunsfeld
700919951e Reorganize parser spec about handling edits 2014-09-29 10:48:35 -07:00
Max Brunsfeld
7988829c08 Add spec for recognition of UTF8 characters 2014-09-27 16:00:48 -07:00
Max Brunsfeld
26ac5788b6 Don't use struct literal syntax for TSLength 2014-09-26 16:31:36 -07:00
Max Brunsfeld
04dc721241 Add missing import for string.h 2014-09-26 16:21:09 -07:00
Max Brunsfeld
c1565c1aae Track AST nodes' sizes in characters as well as bytes
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
2014-09-26 16:15:07 -07:00
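
As a rough illustration of commit c1565c1aae above, the sketch below tracks a length in both bytes and characters for UTF-8 text. It is not the project's code: the names `SketchLength`, `sketch_length_of`, and `sketch_length_add` and the field names are hypothetical, and the input is assumed to be valid, NUL-terminated UTF-8.

```c
#include <stddef.h>

/* Hypothetical stand-in for a TSLength-like struct; field names are assumed. */
typedef struct {
  size_t bytes; /* length in bytes */
  size_t chars; /* length in Unicode code points */
} SketchLength;

/* Measure a NUL-terminated string in bytes and in code points.
   Assumes valid UTF-8: every byte that is not a continuation byte
   (bit pattern 10xxxxxx) starts a new code point. */
static SketchLength sketch_length_of(const char *text) {
  SketchLength result = {0, 0};
  const unsigned char *p;
  for (p = (const unsigned char *)text; *p != 0; p++) {
    result.bytes++;
    if ((*p & 0xC0) != 0x80) {
      result.chars++;
    }
  }
  return result;
}

/* Add two lengths field by field, e.g. to sum the sizes of sibling nodes. */
static SketchLength sketch_length_add(SketchLength a, SketchLength b) {
  SketchLength sum;
  sum.bytes = a.bytes + b.bytes;
  sum.chars = a.chars + b.chars;
  return sum;
}
```

For ASCII-only text the two fields are equal; they diverge as soon as a multi-byte character appears, which is why a byte count alone cannot give the number of characters in a node.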
Max Brunsfeld
c576d7d4a0 In SpyReader, don't return pointers into main content string
This improves test coverage of the lexer. Before, a SpyReader's read function
would return pointers into a single string that contained the entire text. This
could have masked bugs where out-of-bounds characters were being read.
Now the chunks returned by the reader are copied into a separate buffer.
2014-09-26 16:12:52 -07:00
Max Brunsfeld
68d6e242ee Fix parsing of wildcard patterns at the ends of documents
- Remove special EOF handling from lexer
- Explicitly exclude the EOF character from all-inclusive character sets.
2014-09-11 13:10:23 -07:00
Max Brunsfeld
f05762b4a0 Move parser tests into their own file 2014-09-10 18:49:53 -07:00
Max Brunsfeld
1ff7cedf40 Unify ubiquitous tokens and lexical separators in API 2014-09-07 22:16:45 -07:00
Max Brunsfeld
ed11ef557a Fix expansion of repeat rules into recursive rules
Previously, the way repeat rules were expanded, the auxiliary
rule always needed to be reduced, even if the repeating content
was empty. This caused problems in parse states where some items
contained the repeat rule and some did not. To make those cases
work, the repeat rule had to explicitly be marked as optional.
With this change, that is no longer necessary.
2014-09-07 09:39:14 -07:00
Max Brunsfeld
43ecac2a1d Expose debug flag on document 2014-09-06 17:56:00 -07:00
Max Brunsfeld
3dea1261a6 Clean up document specs for incremental parsing 2014-09-03 18:48:10 -07:00
Max Brunsfeld
c72445d808 Fix inc parsing for nodes containing ubiq tokens 2014-09-03 13:17:06 -07:00
Max Brunsfeld
ad52bdc448 Fix inc parsing when appending to end of a token 2014-09-03 07:09:15 -07:00
Max Brunsfeld
77529ace3d Fix infinite loop in certain cases w/ unterminated tokens 2014-09-03 00:38:44 -07:00
Max Brunsfeld
7d81126df3 Remove unnecessary import of public header in specs 2014-09-02 22:17:04 -07:00
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
e941f8c175 Fix error in document editing
When breaking down the stack in parser.c, the previous code
would not account for ubiquitous tokens. This was a problem
for a long time, but wasn't noticed until ubiquitous tokens
started being used to represent separator characters.
2014-09-01 21:32:29 -07:00
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
2985a98150 Build error nodes in lexer again, not in parser 2014-08-31 16:59:01 -07:00
Max Brunsfeld
85d8c9df5c Handle multiple ubiquitous tokens in a row 2014-08-31 12:11:16 -07:00
Max Brunsfeld
a75686b017 Fix double release calls in document spec 2014-08-31 00:46:09 -07:00
Max Brunsfeld
c5ac02c571 Fix size calculation for error nodes 2014-08-29 13:22:03 -07:00
Max Brunsfeld
604b149c4b Assign sizes to error nodes in handle_error 2014-08-28 18:35:30 -07:00
Max Brunsfeld
3430a5edcc Clarify distinction btwn tree padding, tree offset, node position
- Node position is public. It represents the node's first character
  index in the document.
- Tree offset is private. It represents the distance between the tree's
  first character index and its parent's first character index.
- Tree padding is private. It represents the amount of whitespace
  (or other separator characters) immediately preceding the tree.
2014-08-28 13:22:06 -07:00
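
The distinction drawn in commit 3430a5edcc above can be sketched as follows. This is an assumption-laden illustration, not the project's data structure: `SketchTree` and `sketch_node_position` are hypothetical names, the root's offset is assumed to be measured from the start of the document, and a tree's first character is assumed to be the one that follows its padding.

```c
#include <stddef.h>

/* Hypothetical private tree record; layout and names are assumed. */
typedef struct SketchTree {
  const struct SketchTree *parent; /* NULL for the root */
  size_t offset;  /* distance from the parent's first character to this tree's */
  size_t padding; /* separator characters immediately preceding this tree */
} SketchTree;

/* Public node position: the index of the node's first character in the
   document, recovered by summing the private offsets up to the root. */
static size_t sketch_node_position(const SketchTree *tree) {
  size_t position = 0;
  for (; tree != NULL; tree = tree->parent) {
    position += tree->offset;
  }
  return position;
}
```

Storing a parent-relative offset rather than an absolute position means an edit early in the document only changes offsets along one path in the tree. In this sketch the padding is kept alongside the offset and does not enter the position sum.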
Max Brunsfeld
b91f48ced2 Call handle_error even when error occurs exactly where expected
Previously, if an error happened right at the beginning of an error
production, the error node would be immediately shifted onto the stack
without calling the error handling function.
2014-08-27 18:44:27 -07:00
Max Brunsfeld
7b0a52ec26 Pretty-print single hidden tree nodes correctly 2014-08-27 12:56:36 -07:00
Max Brunsfeld
77941c85ff Avoid building incomplete error nodes during lexing
The lexer doesn't know the expected symbols, so it doesn't have enough
information to construct error nodes. Now, when it encounters an invalid
character, it returns NULL and the parser builds a correct error node.
2014-08-25 23:35:00 -07:00
Max Brunsfeld
117869e49a Fix position calculation in node_find_for_range 2014-08-25 15:52:17 -07:00