Max Brunsfeld
b7d0606fbd
Be less conservative in merging parse states with external tokens
...
Also, clean up the internal representation of external tokens
2018-03-16 16:00:40 -07:00
Max Brunsfeld
8c29841adf
Represent repetitions with associative structure
2018-02-12 11:41:56 -08:00
Max Brunsfeld
2e4f76c164
Don't allow an epsilon start rule if it is used in other rules
2018-01-23 17:05:28 -08:00
Max Brunsfeld
532bbeca0d
Remove wrong handling of \a in a regex
2017-12-12 16:50:53 -08:00
Max Brunsfeld
493db39363
Never move the start rule of a grammar into the lexical grammar
...
This preserves a useful invariant that the root node of the AST is never
a token.
2017-12-07 11:50:27 -08:00
Max Brunsfeld
5f40adb70c
Recur to sub-rules in a deterministic order in expand_repeats
2017-08-08 17:20:04 -07:00
Max Brunsfeld
94dc703bfc
Require that grammars' start rules be visible
2017-08-04 17:07:37 -07:00
Max Brunsfeld
cb5fe80348
Rename RENAME rule to ALIAS, allow it to create anonymous nodes
2017-07-31 16:41:11 -07:00
Max Brunsfeld
b3a72954ff
Introduce RENAME rule type
2017-07-13 17:17:22 -07:00
Max Brunsfeld
d646889922
Simplify flatten_rule function
2017-07-13 09:59:23 -07:00
Max Brunsfeld
65bf1389e1
Add a way to automatically inline rules
2017-07-11 23:13:44 -07:00
Max Brunsfeld
0de93b3bf2
Allow negative dynamic precedences
2017-07-06 22:21:59 -07:00
Max Brunsfeld
d8e9d04fe7
Add PREC_DYNAMIC rule for resolving runtime ambiguities
2017-07-06 15:24:45 -07:00
Max Brunsfeld
ed8fbff175
Allow anonymous tokens to be used in grammars' external token lists
2017-03-17 16:31:29 -07:00
Max Brunsfeld
b3edd8f749
Remove use of shared_ptr in choice, repeat, and seq factories
2017-03-17 14:28:13 -07:00
Max Brunsfeld
416cbb9def
Add missing cassert includes
2017-03-17 13:54:40 -07:00
Max Brunsfeld
db4b9ebc7c
Implement Rule as a union rather than an abstract base class
2017-03-17 13:29:31 -07:00
Max Brunsfeld
c79fae6d21
Clean up extract_tokens function
2017-03-09 21:16:20 -08:00
Max Brunsfeld
abf8a4f2c2
🎨
2017-03-01 22:15:26 -08:00
Max Brunsfeld
686dc0997c
Avoid introducing certain lexical conflicts during parse state merging
...
The current pretty conservative approach is to avoid merging parse states which
would cause a pair tokens to co-exist for the first time in any parse state,
where the two tokens can start with the same character and at least one of the
tokens can contain a character which is part of the grammar's separators.
2017-02-27 22:54:38 -08:00
Timothy Clem
ab00f1b0da
Add support for \W and \D negated character classes too
2017-01-31 15:03:48 -08:00
Timothy Clem
902b7f9745
Allow \S for negated whitespace regex shorthand
2017-01-31 14:45:28 -08:00
Max Brunsfeld
4131e1c16e
Return an error when external token name matches non-terminal rule
2017-01-31 11:36:51 -08:00
Max Brunsfeld
42c41c158c
Refactor logic for handling shared internal/external tokens
2016-12-21 10:49:55 -08:00
Max Brunsfeld
a1770ce844
Allow external tokens to be used as extras
2016-12-12 22:06:01 -08:00
Max Brunsfeld
10b51a05a1
Allow external scanners to refer to (and return) internally-defined tokens
...
Tokens that are defined in the grammar's rules may now be included in the
externals list also, so that external scanners can check if they are valid
lookaheads or not, and if so, can return them to the parser if needed.
2016-12-09 13:32:58 -08:00
Max Brunsfeld
83514293b5
Allow external tokens to be either visible or hidden
2016-12-05 17:26:11 -08:00
Max Brunsfeld
49d25bd0f8
Remove EXTERNAL_TOKEN grammar rule type
2016-12-04 15:02:32 -08:00
Max Brunsfeld
c966af0412
Start work on external tokens
2016-12-02 16:24:19 -08:00
Max Brunsfeld
996ca91e70
Disallow syntax rules that match the empty string (for now)
2016-11-30 23:19:54 -08:00
Max Brunsfeld
6cf4ccb840
Represent rule metadata as a struct, not a map
2016-11-19 13:59:34 -08:00
Max Brunsfeld
32387400c6
Rework LR conflict resolution
...
* Unify precedence/associativity-based resolution with the
search for a whitelisted conflict
* Improve conflict error messages
2016-11-18 13:50:55 -08:00
Max Brunsfeld
7bcae8f6a8
🎨 flatten_grammar
2016-11-09 20:29:21 -08:00
Max Brunsfeld
cf19b2e58d
Make repeat rules left-recursive instead of right recursive
2016-04-18 12:40:14 -07:00
Max Brunsfeld
d4632ab9a9
Make the compile function plain C and take a JSON grammar
2016-01-11 12:33:48 -08:00
Max Brunsfeld
36870bfced
Make Grammar a simple struct
2016-01-08 15:51:30 -08:00
Max Brunsfeld
1c6ad5f7e4
Rename ubiquitous_tokens -> extra_tokens in compiler API
...
They were already called this in the runtime code.
'Extra' is just easier to say.
2015-12-17 15:50:50 -08:00
Max Brunsfeld
c495076adb
Record in parse table which actions can hide splits
...
Suppose a parse state S has multiple actions for a terminal lookahead symbol A.
Then during incremental parsing, while in state S, the parser should not
reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B
might prematurely discard one of the possible actions that a batch parser
would have attempted in state S, upon seeing A as a lookahead.
2015-12-17 13:11:56 -08:00
Max Brunsfeld
75f31a79a3
Treat reduce actions with different production IDs as distinct
2015-12-10 13:00:26 -08:00
Max Brunsfeld
53424699e4
Comment all the steps of prepare_grammar
2015-12-02 14:56:59 -08:00
Max Brunsfeld
d5ce268074
Fix handling of changing precedence within lexical rules.
...
A precedence annotation wrapping a sequence of characters now only affects how
tightly those characters bind to *each other*, not how tightly they bind to the
preceding character.
This bug surfaced because a generated lexer was failing to recognize a '\n' character
as a token, instead treating it as ubiquitous whitespace. It made this error
because, even though anonymous ubiquitous tokens have the lowest precedence, the
character immediately *after* the '\n' was part of a normal token, which had
*normal* precedence (0). Advancing into that following token was incorrectly
prioritized above accepting the line-break token.
2015-11-08 13:36:15 -08:00
Max Brunsfeld
d6ee28abd0
Make precedence more useful within tokens
...
Choose accept-token actions over advance actions if their rule has a higher precedence.
2015-11-01 12:48:27 -08:00
Max Brunsfeld
b61b27f22f
Handle inline ubiquitous that are used elsewhere in the grammar
2015-10-26 17:19:37 -07:00
Max Brunsfeld
9959fe35b0
Allow associativity to be specified in rules w/o precedence
2015-10-13 11:25:28 -07:00
Max Brunsfeld
4b817dc07c
Fix linter errors
2015-10-12 19:22:05 -07:00
Max Brunsfeld
82726ad53b
Define repeat rule in terms of repeat1 rule
2015-10-12 19:22:05 -07:00
Max Brunsfeld
5c67f58a4b
Add helper for dynamic-casting to rule subclasses
2015-10-12 19:21:56 -07:00
Max Brunsfeld
db9966b57c
Simplify lex item set transitions code
2015-10-11 22:51:37 -07:00
Max Brunsfeld
25791085c3
Normalize lexical grammar rules before constructing lex table
2015-10-11 16:56:00 -07:00
Max Brunsfeld
5e4bdcbaf8
Simplify handling of precedence & associativity in productions
2015-10-05 16:56:11 -07:00