Max Brunsfeld
8ac4b9fc17
Store productions' end rule ids in the vector
2015-02-16 22:11:03 -08:00
Max Brunsfeld
68a0e16d1e
Add void specialization of RuleFn template
2015-02-16 22:11:03 -08:00
Max Brunsfeld
160fca6579
Refactor avoidance of redundant repeat rules
2015-01-14 21:11:19 -08:00
Max Brunsfeld
a0d9da9d5c
Rename static 'Build' methods to 'build'
2015-01-14 21:11:05 -08:00
Max Brunsfeld
aae6f6de14
Remove whitespace between template closing tags
2014-10-12 11:51:12 -07:00
Max Brunsfeld
070dc76050
Generate correct C literals for non-ascii characters
2014-09-28 18:40:15 -07:00
Max Brunsfeld
e0185f84fc
Print non-ascii characters as numbers in CharacterRange::to_string
2014-09-28 18:19:42 -07:00
Max Brunsfeld
68d6e242ee
Fix parsing of wildcard patterns at the ends of documents
...
- Remove special EOF handling from lexer
- Explicitly exclude the EOF character from all-inclusive character sets.
2014-09-11 13:10:23 -07:00
Max Brunsfeld
2e7ffb4d14
Tweak auto-format settings
...
Prefer lines that exceed 80 characters by a small margin to
line breaks in argument lists
2014-09-09 13:15:40 -07:00
Max Brunsfeld
1ff7cedf40
Unify ubiquitous tokens and lexical separators in API
2014-09-07 22:16:45 -07:00
Max Brunsfeld
545e575508
Revert "Remove the separator characters construct"
...
This reverts commit 5cd07648fd.
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd
Remove the separator characters construct
...
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
8f4939a3d3
unsigned char -> uint32_t in CharacterRange
2014-08-24 01:05:59 -07:00
Max Brunsfeld
9338249075
Remove implicit CharacterRange constructors
...
Also fix misc smaller lint errors
2014-08-23 14:52:44 -07:00
Max Brunsfeld
0bb5663f0f
Refactor - represent char sets in terms of inclusions and exclusions
2014-08-23 14:25:45 -07:00
Max Brunsfeld
6f374fddff
Tweak format for pretty printing some classes
2014-08-23 13:52:00 -07:00
Max Brunsfeld
2963b08f79
Eliminate duplicates when constructing choice rules
2014-08-21 20:04:42 -07:00
Max Brunsfeld
1e79ed794b
Allow multiple top-level nodes
...
Now, the root node of a document is always a document node.
It will often have only one child node which corresponds to the grammar's
start symbol, but not always. Currently, it may have more than one child
if there are ubiquitous tokens such as comments at the beginning of the
document. In the future, it will also be possible for the document to have
multiple children if the document is partially parsed.
2014-08-09 00:00:20 -07:00
Max Brunsfeld
41c4e7cd8f
Fix more namespace formatting
2014-08-08 08:35:26 -07:00
Max Brunsfeld
98cc2f2264
Auto-format all source code with clang-format
2014-07-21 13:20:00 -07:00
Max Brunsfeld
2c382b7363
Trim trailing whitespace
2014-06-16 21:33:35 -07:00
Max Brunsfeld
7a2c2c1c90
Store ParseItemSets as maps, w/ core items as keys
...
ParseItem no longer has a lookahead_sym field; it now represents
the 'core' of a parse item. The lookahead context is stored separately,
as a set per core item. This makes iterating, copying and merging item
sets more efficient, because before, the core items were repeated for each
different lookahead symbol.
Also, the memoization in sym_transitions(ParseItemSet) has been removed.
Maybe I'll add it back later.
2014-06-16 08:35:20 -07:00
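A rough sketch of the layout that commit 7a2c2c1c90 describes: each core item appears once as a map key, and its lookahead symbols are kept in a separate set. The field names, the `Symbol` alias, and the `merge` helper below are hypothetical stand-ins for illustration, not the project's actual definitions.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <tuple>

// Hypothetical stand-in for the project's symbol type.
using Symbol = std::string;

// A "core" parse item: no lookahead_sym field. Field names are assumptions.
struct ParseItem {
  Symbol lhs;
  int production_id;
  int consumed_symbol_count;

  // Ordering so that ParseItem can be used as a std::map key.
  bool operator<(const ParseItem &other) const {
    return std::tie(lhs, production_id, consumed_symbol_count) <
           std::tie(other.lhs, other.production_id,
                    other.consumed_symbol_count);
  }
};

// Lookahead context is stored separately, as one set per core item.
using LookaheadSet = std::set<Symbol>;
using ParseItemSet = std::map<ParseItem, LookaheadSet>;

// Merge `source` into `target`. Each core item is visited once, no matter
// how many lookahead symbols it has. Returns true if `target` changed.
bool merge(ParseItemSet &target, const ParseItemSet &source) {
  bool changed = false;
  for (const auto &entry : source) {
    // insert() leaves an existing entry untouched and reports whether
    // the core item was newly added.
    auto result = target.insert({entry.first, LookaheadSet()});
    if (result.second) changed = true;
    LookaheadSet &lookaheads = result.first->second;
    std::size_t size_before = lookaheads.size();
    lookaheads.insert(entry.second.begin(), entry.second.end());
    if (lookaheads.size() != size_before) changed = true;
  }
  return changed;
}
```

With one entry per (core item, lookahead) pair, the same merge would repeat the core item comparison for every lookahead symbol. Keying on the core item avoids that repetition.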
Max Brunsfeld
3cd031af38
Add keypattern rule helper
...
This way, pattern rules (e.g. golang's comment) can be easily given the
same precedence as keyword rules.
2014-06-11 12:40:49 -07:00
Max Brunsfeld
649f200831
Expand regex/string rules as part of grammar preparation
...
This makes it possible to report errors in regex parsing
2014-05-19 20:54:59 -07:00
Max Brunsfeld
3a50171249
Expose all grammar compilation errors
2014-05-01 23:28:40 -07:00
Max Brunsfeld
93620b3ed1
Add keyword helper for making higher-priority string tokens
2014-05-01 13:25:20 -07:00
Max Brunsfeld
6d40dcf881
Add token helper for building token rules
...
Now you can specify the structure of tokens using
all of the rule functions, not just `str` and `pattern`
2014-05-01 12:43:29 -07:00
Max Brunsfeld
0d763d229d
cpplint
2014-04-28 21:46:43 -07:00
Max Brunsfeld
25eda9d889
ISymbol -> Symbol
...
Interned symbols are now the main type of symbol in use
2014-04-28 20:43:27 -07:00
Max Brunsfeld
faf80aadac
Symbol -> NamedSymbol
2014-04-28 20:15:49 -07:00
Max Brunsfeld
6ea4e6b2b0
Give rules::Visitor a virtual destructor
2014-04-27 23:19:11 -07:00
Max Brunsfeld
b2cb78166e
Give Rule a virtual destructor
...
Not needed at the moment because rule pointers are always wrapped
in shared_ptrs. Still, don't want to forget this if I stopped using shared_ptrs
at some point.
2014-04-27 21:49:56 -07:00
Max Brunsfeld
29bbff655c
Store choice rules using vectors, not pairs
2014-04-26 23:21:09 -07:00
Max Brunsfeld
93df5579b4
Trim whitespace
2014-04-25 22:17:23 -07:00
Max Brunsfeld
c2abfd2d03
Parse '.' in regexes
2014-04-24 13:21:46 -07:00
Max Brunsfeld
020614824a
Avoid unnecessary dynamic cast in symbol equality function
2014-04-23 13:12:39 -07:00
Max Brunsfeld
3b388d66cd
Profile and optimize
...
- Eliminate unnecessary copies of grammar objects
- Do cheaper comparisons first in equality methods
2014-04-23 08:32:11 -07:00
Max Brunsfeld
68d44fd565
Intern symbols during grammar preparation
2014-04-22 23:38:26 -07:00
Max Brunsfeld
a437d39773
Add rule precedence construct
...
Still need to add some way of expressing left and right
associativity
2014-04-15 08:40:46 -07:00
Max Brunsfeld
67243c7e2f
cpplint
2014-04-14 08:38:44 -07:00
Max Brunsfeld
a5816a9624
Refactor rule visitors
2014-04-09 13:28:02 -07:00
Max Brunsfeld
6a0a28f4b3
WIP - try to fix travis build
2014-04-08 21:41:38 -07:00
Max Brunsfeld
1da9f1fdfd
Store rule metadata as a map, not a single number
...
Need to store more than just boolean values
2014-04-07 08:50:00 -07:00
Max Brunsfeld
5320cad065
Trim trailing whitespace
2014-04-04 13:10:55 -07:00
Max Brunsfeld
1cc7e32e2d
Fix handling of tokens consisting of separator characters
...
The parser is no longer hard-coded to skip whitespace. Tokens
such as newlines, whose characters overlap with the separator
characters, can now be correctly recognized.
2014-04-03 19:10:09 -07:00
Max Brunsfeld
f39cb1890d
Refactor rule visitor objects
2014-04-01 13:38:02 -07:00
Max Brunsfeld
2a222adb7e
Represent character sets with unsigned chars
...
This is better for comparing character ranges, since
there is a definite maximum character value.
2014-03-31 18:47:18 -07:00
Max Brunsfeld
7824b3191b
Fix bug in character set difference calculation
2014-03-31 18:38:54 -07:00
Max Brunsfeld
7adb0bf34f
Add golang example grammar
...
Also, support '\a' character class shorthand in regexes,
for alphabetical characters
2014-03-29 16:29:34 -07:00
Max Brunsfeld
13c4e6e648
Tweak format for example grammars
2014-03-28 13:51:32 -07:00