Max Brunsfeld
26f9e22193
Clean up parser code
2014-10-08 16:51:04 -07:00
Max Brunsfeld
af7f57a80e
Fix sizing of error nodes after edits
2014-10-05 16:56:50 -07:00
Max Brunsfeld
6d37877e49
Tweak debugging output
2014-10-05 16:56:29 -07:00
Max Brunsfeld
e5ea4efb0b
Use stdbool.h
2014-10-03 16:06:08 -07:00
Max Brunsfeld
808b003f1a
Read unicode characters correctly in Lexer advance
2014-10-03 15:44:49 -07:00
Max Brunsfeld
78c5fe8e02
clang-format
2014-10-03 15:44:21 -07:00
Max Brunsfeld
5dd8778996
Clean up Parser handle_error function
2014-09-29 10:33:56 -07:00
Max Brunsfeld
10a3251fbe
Remove index parameter from STACK_FROM_TOP macro
2014-09-29 10:16:24 -07:00
Max Brunsfeld
070dc76050
Generate correct C literals for non-ascii characters
2014-09-28 18:40:15 -07:00
Max Brunsfeld
cb5ecbd491
Handle string and regex rules w/ non-ascii chars
2014-09-28 18:21:22 -07:00
Max Brunsfeld
e0185f84fc
Print non-ascii characters as numbers in CharacterRange::to_string
2014-09-28 18:19:42 -07:00
Max Brunsfeld
26ac5788b6
Don't use struct literal syntax for TSLength
2014-09-26 16:31:36 -07:00
Max Brunsfeld
c1565c1aae
Track AST nodes' sizes in characters as well as bytes
...
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
2014-09-26 16:15:07 -07:00
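A minimal sketch of the idea behind this commit, assuming a struct that carries both counts. The field names `bytes` and `chars` and the helpers `length_add` and `length_of_utf8` are hypothetical, since the log only says that TSLength contains lengths in both characters and bytes. The real runtime may lay this out differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the commit message only states that TSLength
 * holds a length in bytes and a length in characters. */
typedef struct {
  size_t bytes;
  size_t chars;
} TSLength;

/* Add two lengths field by field, e.g. to sum the sizes of sibling nodes.
 * (Hypothetical helper, not taken from the source.) */
static TSLength length_add(TSLength a, TSLength b) {
  TSLength result;
  result.bytes = a.bytes + b.bytes;
  result.chars = a.chars + b.chars;
  return result;
}

/* Measure a NUL-terminated string, assuming it is valid UTF-8.
 * (Hypothetical helper, not taken from the source.) */
static TSLength length_of_utf8(const char *text) {
  const unsigned char *p;
  TSLength result;
  assert(text != NULL);
  result.bytes = 0;
  result.chars = 0;
  for (p = (const unsigned char *)text; *p != 0; p++) {
    result.bytes++;
    /* Continuation bytes have the form 10xxxxxx; count only lead bytes. */
    if ((*p & 0xC0) != 0x80) result.chars++;
  }
  return result;
}
```

For ASCII-only text the two fields are equal. They diverge as soon as a multi-byte character appears, which is why a byte count alone cannot give the number of unicode characters in a node.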
Max Brunsfeld
141cbcfa02
Read unicode characters using utf8proc
2014-09-13 00:24:10 -07:00
Max Brunsfeld
e23f11b7c4
Allow lexical debug mode to be enabled on documents
...
- `ts_document_set_debug(doc, 1)` implies parse debug mode
- `ts_document_set_debug(doc, > 1)` implies parse and lex debug mode
2014-09-11 13:12:06 -07:00
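The two levels described above could be read as thresholds on a single integer. The sketch below is only an illustration: the helper names `parse_debug_enabled` and `lex_debug_enabled` are hypothetical, and treating values below 1 as "debugging off" is an assumption that the commit message does not state.

```c
#include <stdbool.h>

/* Hypothetical helper: any level of 1 or more turns on parse debugging. */
static bool parse_debug_enabled(int debug_level) {
  return debug_level >= 1;
}

/* Hypothetical helper: a level above 1 additionally turns on lex debugging. */
static bool lex_debug_enabled(int debug_level) {
  return debug_level > 1;
}
```

Under this reading, lex debugging never runs without parse debugging, which matches the "parse and lex" wording of the second bullet.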
Max Brunsfeld
68d6e242ee
Fix parsing of wildcard patterns at the ends of documents
...
- Remove special EOF handling from lexer
- Explicitly exclude the EOF character from all-inclusive character sets
2014-09-11 13:10:23 -07:00
Max Brunsfeld
209992c832
Remove trailing whitespace
2014-09-10 13:19:45 -07:00
Max Brunsfeld
9a93f6bdef
Clean up prepare_grammar function
2014-09-10 13:02:31 -07:00
Max Brunsfeld
cd8a683229
Improve error messages for invalid ubiquitous tokens
2014-09-10 13:02:16 -07:00
Max Brunsfeld
2e7ffb4d14
Tweak auto-format settings
...
Prefer lines that slightly exceed 80 characters over
line breaks in argument lists.
2014-09-09 13:15:40 -07:00
Max Brunsfeld
8f109504a8
Clean up extract_tokens function
2014-09-09 12:57:29 -07:00
Max Brunsfeld
9ee0665fad
Remove unused code in extract_tokens.cc
2014-09-09 12:34:15 -07:00
Max Brunsfeld
e181426f6f
Use make_tuple rather than init list syntax for gcc
2014-09-07 22:58:45 -07:00
Max Brunsfeld
1ff7cedf40
Unify ubiquitous tokens and lexical separators in API
2014-09-07 22:16:45 -07:00
Max Brunsfeld
a46f9d950c
Handle '\s' correctly in regexps
2014-09-07 16:05:43 -07:00
Max Brunsfeld
2a9f51790f
Move is_token function to its own file
2014-09-07 13:49:44 -07:00
Max Brunsfeld
ed11ef557a
Fix expansion of repeat rules into recursive rules
...
Previously, repeat rules were expanded in such a way that the auxiliary
rule always needed to be reduced, even if the repeating content
was empty. This caused problems in parse states where some items
contained the repeat rule and some did not. To make those cases
work, the repeat rule had to be explicitly marked as optional.
With this change, that is no longer necessary.
2014-09-07 09:39:14 -07:00
Max Brunsfeld
43ecac2a1d
Expose debug flag on document
2014-09-06 17:56:00 -07:00
Max Brunsfeld
c0a3f8d39c
Remove some macros from public parser header
2014-09-05 23:47:38 -07:00
Max Brunsfeld
d3204d3526
Include '_' in '\w' regex character class
2014-09-05 18:41:12 -07:00
Max Brunsfeld
8512af712e
Add debug log when re-lexing during error handling
2014-09-05 18:38:17 -07:00
Max Brunsfeld
6cf267efaf
Clean up breakdown stack function
2014-09-03 22:35:52 -07:00
Max Brunsfeld
9c0b5b5571
clang-format
2014-09-03 18:53:38 -07:00
Max Brunsfeld
3dea1261a6
Clean up document specs for incremental parsing
2014-09-03 18:48:10 -07:00
Max Brunsfeld
c72445d808
Fix incremental parsing for nodes containing ubiquitous tokens
2014-09-03 13:17:06 -07:00
Max Brunsfeld
ad52bdc448
Fix incremental parsing when appending to end of a token
2014-09-03 07:09:15 -07:00
Max Brunsfeld
cc5f1471a8
Add debug lines for breaking down stack when re-parsing
2014-09-02 22:16:17 -07:00
Max Brunsfeld
545e575508
Revert "Remove the separator characters construct"
...
This reverts commit 5cd07648fd.
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
e941f8c175
Fix error in document editing
...
When breaking down the stack in parser.c, the previous code
did not account for ubiquitous tokens. This had been a problem
for a long time, but it went unnoticed until ubiquitous tokens
started being used to represent separator characters.
2014-09-01 21:32:29 -07:00
Max Brunsfeld
5cd07648fd
Remove the separator characters construct
...
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
db295cebbc
Suppress unused variable warning in stack iteration macro
2014-09-01 14:16:27 -07:00
Max Brunsfeld
d38f095f01
Clean up Tree code
2014-09-01 14:08:07 -07:00
Max Brunsfeld
88d07c8960
Clean up parse table lookup function
2014-08-31 21:17:32 -07:00
Max Brunsfeld
2985a98150
Build error nodes in lexer again, not in parser
2014-08-31 16:59:01 -07:00
Max Brunsfeld
16d5cf1d04
Remove expected symbols from error nodes
2014-08-31 16:39:16 -07:00
Max Brunsfeld
25a254a732
Comment and format
2014-08-31 16:24:27 -07:00
Max Brunsfeld
85d8c9df5c
Handle multiple ubiquitous tokens in a row
2014-08-31 12:11:16 -07:00
Max Brunsfeld
e6bbab41e5
Realloc parse stack when it grows to its capacity
2014-08-30 21:39:55 -07:00
Max Brunsfeld
c5ac02c571
Fix size calculation for error nodes
2014-08-29 13:22:03 -07:00
Max Brunsfeld
604b149c4b
Assign sizes to error nodes in handle_error
2014-08-28 18:35:30 -07:00