Commit graph

311 commits

Author SHA1 Message Date
Max Brunsfeld
8ac4b9fc17 Store productions' end rule ids in the vector 2015-02-16 22:11:03 -08:00
Max Brunsfeld
1ba8701ada Compute fewer item set closures in item set transitions function 2015-02-16 22:11:03 -08:00
Max Brunsfeld
52daffb3f3 Separate syntax rules into flat lists of symbols
This way, every ParseItem can be associated with a particular production
for its non-terminal. That lets us keep track of which productions are
involved in shift/reduce conflicts.
2015-02-16 22:11:03 -08:00
Max Brunsfeld
68a0e16d1e Add void specialization of RuleFn template 2015-02-16 22:11:03 -08:00
Max Brunsfeld
074a7884aa Fix uninitialized variable error 2015-02-16 22:09:30 -08:00
Max Brunsfeld
109b5616d3 Remove unused arg to action_takes_precedence 2015-01-17 14:14:49 -08:00
Max Brunsfeld
160fca6579 Refactor avoidance of redundant repeat rules 2015-01-14 21:11:19 -08:00
Max Brunsfeld
a0d9da9d5c Rename static 'Build' methods to 'build' 2015-01-14 21:11:05 -08:00
Max Brunsfeld
34d96909d1 Move {Syntax,Lexical}Grammar into separate files 2015-01-14 21:10:41 -08:00
Max Brunsfeld
0d267e41aa Replace LexConflictManager class with action_takes_precedence function 2015-01-11 19:50:58 -08:00
Max Brunsfeld
5dc08ccce9 Include names of in-progress rules in shift/reduce conflicts 2014-11-05 18:39:50 -08:00
Max Brunsfeld
fc83322832 Handle more special characters in c symbol names 2014-10-31 18:37:56 -07:00
Max Brunsfeld
4fc960e4ec Give extracted string/regex tokens more descriptive names 2014-10-31 08:56:58 -07:00
Max Brunsfeld
ef2084d3c8 Tweak parse debugging 2014-10-13 21:20:08 -07:00
Max Brunsfeld
71cc7a2dc2 Tidy up remaining files in build_tables namespace 2014-10-13 01:02:18 -07:00
Max Brunsfeld
9fd2821389 Fix TODO comment in build_lex_table 2014-10-12 13:53:31 -07:00
Max Brunsfeld
6415690738 Tidy up get_metadata function 2014-10-12 13:04:11 -07:00
Max Brunsfeld
b23caf366f Tidy up first_symbols function 2014-10-12 13:02:39 -07:00
Max Brunsfeld
faecdcbb2f Tidy up build_tables function 2014-10-12 12:57:46 -07:00
Max Brunsfeld
8379d9387c Tidy up build_parse_table function 2014-10-12 12:56:04 -07:00
Max Brunsfeld
1fb52eacab Tidy up build_lex_table function 2014-10-12 12:44:16 -07:00
Max Brunsfeld
0daada6921 Tidy up c-code generator 2014-10-12 12:33:45 -07:00
Max Brunsfeld
5b624d37f6 Use consistent iteration in c code generator 2014-10-12 11:57:52 -07:00
Max Brunsfeld
aae6f6de14 Remove whitespace between template closing tags 2014-10-12 11:51:12 -07:00
Max Brunsfeld
3bcb221379 Include non-terminal lookahead symbols for reduction actions
This is necessary for re-using the right subtree after an edit
2014-10-10 12:06:16 -07:00
Max Brunsfeld
78c5fe8e02 clang-format 2014-10-03 15:44:21 -07:00
Max Brunsfeld
070dc76050 Generate correct C literals for non-ascii characters 2014-09-28 18:40:15 -07:00
Max Brunsfeld
cb5ecbd491 Handle string and regex rules w/ non-ascii chars 2014-09-28 18:21:22 -07:00
Max Brunsfeld
e0185f84fc Print non-ascii characters as numbers in CharacterRange::to_string 2014-09-28 18:19:42 -07:00
Max Brunsfeld
68d6e242ee Fix parsing of wildcard patterns at the ends of documents
- Remove special EOF handling from lexer
- Explicitly exclude the EOF character from all-inclusive character sets.
2014-09-11 13:10:23 -07:00
Max Brunsfeld
209992c832 Remove trailing whitespace 2014-09-10 13:19:45 -07:00
Max Brunsfeld
9a93f6bdef Clean up prepare_grammar function 2014-09-10 13:02:31 -07:00
Max Brunsfeld
cd8a683229 Improve error messages for invalid ubiquitous tokens 2014-09-10 13:02:16 -07:00
Max Brunsfeld
2e7ffb4d14 Tweak auto-format settings
Prefer lines that exceed 80 characters by a small margin to
line breaks in argument lists
2014-09-09 13:15:40 -07:00
Max Brunsfeld
8f109504a8 Clean up extract_tokens function 2014-09-09 12:57:29 -07:00
Max Brunsfeld
9ee0665fad Remove unused code in extract_tokens.cc 2014-09-09 12:34:15 -07:00
Max Brunsfeld
e181426f6f Use make_tuple rather than init list syntax for gcc 2014-09-07 22:58:45 -07:00
Max Brunsfeld
1ff7cedf40 Unify ubiquitous tokens and lexical separators in API 2014-09-07 22:16:45 -07:00
Max Brunsfeld
a46f9d950c Handle '\s' correctly in regexps 2014-09-07 16:05:43 -07:00
Max Brunsfeld
2a9f51790f Move is_token function to its own file 2014-09-07 13:49:44 -07:00
Max Brunsfeld
ed11ef557a Fix expansion of repeat rules into recursive rules
Previously, the way repeat rules were expanded, the auxiliary
rule always needed to be reduced, even if the repeating content
was empty. This caused problems in parse states where some items
contained the repeat rule and some did not. To make those cases
work, the repeat rule had to explicitly be marked as optional.
With this change, that is no longer necessary.
2014-09-07 09:39:14 -07:00
Max Brunsfeld
c0a3f8d39c Remove some macros from public parser header 2014-09-05 23:47:38 -07:00
Max Brunsfeld
d3204d3526 Include '_' in '\w' regex character class 2014-09-05 18:41:12 -07:00
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
e941f8c175 Fix error in document editing
When breaking down the stack in parser.c, the previous code
would not account for ubiquitous tokens. This was a problem
for a long time, but wasn't noticed until ubiquitous tokens
started being used to represent separator characters
started being used to represent separator characters.
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
346cf4fe5d Remove LEX_PANIC macro 2014-08-26 13:12:12 -07:00
Max Brunsfeld
8f4939a3d3 unsigned char -> uint32_t in CharacterRange 2014-08-24 01:05:59 -07:00
Max Brunsfeld
9338249075 Remove implicit CharacterRange constructors
Also fix misc smaller lint errors
2014-08-23 14:52:44 -07:00
Max Brunsfeld
0bb5663f0f Refactor - represent char sets in terms of inclusions and exclusions 2014-08-23 14:25:45 -07:00