Max Brunsfeld
996ca91e70
Disallow syntax rules that match the empty string (for now)
2016-11-30 23:19:54 -08:00
Max Brunsfeld
101e304a8a
Avoid unnecessary lookahead set mutations in ParseItemSetBuilder
2016-11-20 21:41:36 -08:00
Max Brunsfeld
06215607d1
Precompute transitive closure contributions by grammar symbol
2016-11-20 11:49:55 -08:00
Max Brunsfeld
5332fd3418
Fix build warnings
2016-11-19 20:47:43 -08:00
Max Brunsfeld
6cf4ccb840
Represent rule metadata as a struct, not a map
2016-11-19 13:59:34 -08:00
Max Brunsfeld
cab1bd3ac5
Make conflict messages explicit about precedence combinations
2016-11-18 17:05:16 -08:00
Max Brunsfeld
5924285e69
🎨
2016-11-18 16:14:05 -08:00
Max Brunsfeld
32387400c6
Rework LR conflict resolution
...
* Unify precedence/associativity-based resolution with the
search for a whitelisted conflict
* Improve conflict error messages
2016-11-18 13:50:55 -08:00
Max Brunsfeld
6935f1d26f
Use hash_combine everywhere
2016-11-16 11:46:22 -08:00
Max Brunsfeld
6cfd009503
Compute parse state group signature based on the item set
2016-11-16 10:21:30 -08:00
Max Brunsfeld
42d37656ea
Optimize remove_duplicate_parse_states method
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-11-15 17:51:52 -08:00
Max Brunsfeld
1118a9142a
Introduce Symbol::Index type alias
2016-11-14 10:25:26 -08:00
Max Brunsfeld
a89f8c086b
Remove stray #include
2016-11-14 09:31:32 -08:00
Max Brunsfeld
fad7294ba4
Store shift states for non-terminals directly in the main parse table
2016-11-14 08:36:06 -08:00
Max Brunsfeld
8d9c261e3a
Don't include reduce actions for nonterminal lookaheads
2016-11-10 11:33:37 -08:00
Max Brunsfeld
255bc2427c
🎨 build_parse_table
2016-11-09 20:47:47 -08:00
Max Brunsfeld
7bcae8f6a8
🎨 flatten_grammar
2016-11-09 20:29:21 -08:00
Max Brunsfeld
3b3fddd64d
Relax overly conservative parse state mergeability check
...
Built-in symbols (e.g. EOF, ERROR) should not prevent parse states from being
merged. Neither should non-token productions.
2016-10-26 21:58:15 -07:00
Timothy Clem
693c6d40dd
Move setup of mergeable_symbols to constructor, use set throughout
2016-10-18 15:18:33 -07:00
Timothy Clem
14bae584d4
WIP: New check for mergeable symbols in merge_state
2016-10-18 13:03:41 -07:00
Max Brunsfeld
e149d94ff5
Remove generated parsers' dependency on runtime.h
2016-10-05 14:02:49 -07:00
Max Brunsfeld
b76574e01c
Handle ambiguities between extra and non-extra tokens using normal GLR splitting
2016-09-06 10:22:16 -07:00
Max Brunsfeld
1c52c30111
Fix unexpected EOF errors getting lost
2016-09-03 22:46:14 -07:00
Max Brunsfeld
88e8cab7f9
Remove all mention of the ERROR rule type
2016-09-01 16:34:44 -07:00
Max Brunsfeld
4182de2975
Include each symbol's numeric value in generated code
...
Sometimes these are useful for debugging
2016-08-26 17:40:22 -07:00
Max Brunsfeld
0ee1994078
Don't have both shift and shift-extra actions in recovery states
2016-07-17 13:35:58 -07:00
Max Brunsfeld
1c66d90203
Mark repeat symbols as anonymous
2016-07-17 10:44:08 -07:00
Max Brunsfeld
fa8993460e
Don't reuse unexpected tokens for now
2016-07-17 07:25:13 -07:00
Max Brunsfeld
0e2bbbd7ee
Compress parse table by allowing reductions w/ unexpected lookaheads
2016-07-04 12:20:23 -07:00
Max Brunsfeld
8c26d99353
Store error recovery actions in the normal parse table
2016-06-27 14:07:47 -07:00
Max Brunsfeld
877fe1f682
Fix incorrect extra entry in symbol metadata table
2016-06-26 22:14:31 -07:00
Max Brunsfeld
43ae8235fd
Remove the error action; a lack of actions implies an error.
2016-06-21 22:53:48 -07:00
Max Brunsfeld
38c144b4a3
Refine logic for deciding when tokens need to be re-lexed
...
* While generating the lex table, note which tokens can match the
same string. A token needs to be relexed when it has possible
homonyms in the current state.
* Also note which tokens can match substrings of other tokens.
A token needs to be relexed when there are viable tokens that
could match longer strings in the current state and the next
token has been edited.
* Remove the logic for marking tokens as fragile on creation.
* Move the reusability/non-reusability of symbols off of individual
actions and onto the entire entry for the state & symbol.
2016-06-21 07:28:04 -07:00
Max Brunsfeld
45f7cee0c8
Handle extra tokens properly during error recovery
2016-06-18 20:46:25 -07:00
Max Brunsfeld
94721c7ec0
Rewind and re-tokenize in error mode after detecting an error
2016-06-17 21:26:03 -07:00
Max Brunsfeld
6d40e317df
Ensure that reductions are ordered by child count in parse table
2016-06-10 13:11:52 -07:00
Max Brunsfeld
1e353381ff
Don't create error node in lexer unless token is completely invalid
...
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.
Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
2016-05-26 14:15:10 -07:00
Max Brunsfeld
a3679fbb1f
Distinguish separators from main tokens via a property on transitions
...
It was incorrect to store it as a property on the lexical states themselves
2016-05-19 16:27:25 -07:00
Max Brunsfeld
59712ec492
Clean up lex table generation
2016-05-19 13:25:46 -07:00
Max Brunsfeld
507d5ad9f7
Include shift-extra actions alongside other actions in recovery states
2016-05-16 10:33:18 -07:00
Max Brunsfeld
19bd09b81d
Don't include accept actions in recovery states
2016-05-11 14:02:26 -07:00
Max Brunsfeld
22c550c9d6
Discard tokens after error detection to find the best repair
...
* Use GLR stack-splitting to try all numbers of tokens to
discard until a repair is found.
* Check the validity of repairs by looking at the child trees,
rather than the statically-computed 'in-progress symbols' list
2016-05-11 13:49:43 -07:00
Max Brunsfeld
9ad1e36238
Rename out_of_context_states -> recovery_states
2016-04-27 14:14:56 -07:00
Max Brunsfeld
5b74813a5c
Refine logic for which tokens to use in error recovery
2016-04-27 14:09:19 -07:00
Max Brunsfeld
31f6b2e24a
Refactor construction of out-of-context states
2016-04-25 21:59:40 -07:00
Max Brunsfeld
cf19b2e58d
Make repeat rules left-recursive instead of right recursive
2016-04-18 12:40:14 -07:00
Max Brunsfeld
cad663b144
Consider multiple error repairs on the same path of the stack
...
This changes the API to the stack_iterate function so that you can pop
from the stack without stopping iteration
2016-04-15 21:28:00 -07:00
Max Brunsfeld
9657dfcfc3
Compute in-progress symbols for out-of-context states
2016-03-10 11:39:44 -08:00
Max Brunsfeld
b733b0cc81
Remove duplicate parse actions
...
This has only come up for out-of-context states, but it seems possible now that there
could be duplicate actions for any state, because of the possibility of multiple
actions with different precedence or associativity that are otherwise the same
2016-03-02 20:58:40 -08:00
Max Brunsfeld
b68f7212c8
Do not consider any symbols to be 'in-progress' in out-of-context states
2016-03-02 20:58:39 -08:00