429 commits

Author SHA1 Message Date
Max Brunsfeld
ffcd8b5c49 Generate C code for the in-progress symbols in each parse state 2016-03-02 20:56:05 -08:00
Max Brunsfeld
00d953f507 Generate C code for out-of-context states 2016-03-02 20:56:05 -08:00
Max Brunsfeld
8c01b70ce7 Don't skip tokens that are not the start of any non-terminal 2016-03-02 20:56:05 -08:00
Max Brunsfeld
b4f2407a49 Add forward move states for each terminal symbol 2016-03-02 20:56:04 -08:00
Max Brunsfeld
dee1f697c1 Compute the set of variables that can begin with each terminal symbol 2016-02-25 21:51:52 -08:00
Max Brunsfeld
3f08bfb264 Fix build warnings 2016-02-12 14:11:11 -08:00
Max Brunsfeld
b80a330a74 Fix assorted memory leaks in test code 2016-02-05 12:23:54 -08:00
Max Brunsfeld
6401a065ae Use different types for advance and accept-token actions
Unlike with parse actions, lexical actions of different types never appear
in the same places in the table
2016-01-22 22:24:11 -07:00
Max Brunsfeld
f0b1d851ce Fix uninitialized instance variable in ParseAction 2016-01-21 23:52:05 -07:00
Max Brunsfeld
569b9d4099 Allow comments within grammar JSON 2016-01-14 11:28:13 -08:00
Max Brunsfeld
49f393b75e Merge pull request #22 from maxbrunsfeld/c-compiler-api
Simplify the compiler API
2016-01-13 21:08:41 -08:00
Max Brunsfeld
d4632ab9a9 Make the compile function plain C and take a JSON grammar 2016-01-11 12:33:48 -08:00
Max Brunsfeld
b69e19c525 Add plain C API for compiling a JSON grammar 2016-01-10 13:44:22 -08:00
Max Brunsfeld
36870bfced Make Grammar a simple struct 2016-01-08 15:51:30 -08:00
Max Brunsfeld
e59f6294cb Fix bug in lexical state de-duping 2015-12-30 11:15:36 -08:00
Max Brunsfeld
4b04afac5e Control lexer's error-mode via explicit boolean argument
Previously, the lexer would operate in error-mode (ignoring any garbage input
until it found a valid token) if it was invoked in the 'error' state. Now that
the error state is deduped with other lexical states, the lexer might be invoked
in that state even when error-mode is not intended. This adds a third argument
to `ts_lex` that explicitly sets the error-mode.

This bug was unlikely to occur in any real grammars, but it caused the
node-tree-sitter-compiler test suite to fail for some grammars with only one
rule.
2015-12-30 09:43:12 -08:00
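
The idea in 4b04afac5e, that an explicit flag rather than the lex state decides whether garbage is skipped, can be sketched with a toy lexer. Everything below is hypothetical: `toy_lex`, the `Token` struct, and the digits-only token rule are illustration names, not the real `ts_lex` signature.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token type for this sketch: a run of decimal digits.
struct Token {
  bool found;         // false if no token was recognized
  std::size_t start;  // index of the first character of the token
  std::size_t end;    // index one past the last character of the token
};

// Toy lexer (an assumption, not the real ts_lex). The explicit `error_mode`
// argument, not the lex state, decides whether leading garbage is skipped.
Token toy_lex(const std::string &input, std::size_t pos, bool error_mode) {
  if (error_mode) {
    // Error mode: ignore any input that cannot start a valid token.
    while (pos < input.size() &&
           !std::isdigit(static_cast<unsigned char>(input[pos]))) {
      pos++;
    }
  }

  std::size_t start = pos;
  while (pos < input.size() &&
         std::isdigit(static_cast<unsigned char>(input[pos]))) {
    pos++;
  }

  // A token was found only if at least one digit was consumed.
  return Token{pos > start, start, pos};
}
```

With the input `"??42"`, calling `toy_lex(input, 0, false)` finds no token, while `toy_lex(input, 0, true)` skips the two `?` characters and returns the token spanning indices 2 to 4.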
Max Brunsfeld
4ad1a666be clang-format 2015-12-29 21:17:31 -08:00
Max Brunsfeld
939476c947 When removing duplicate lex states, update the error state too
Now, instead of being stored as a separate field on the parse table, the error
state is just the first state in the states vector.
2015-12-29 21:02:24 -08:00
Max Brunsfeld
97a281502e Store parse table more compactly 2015-12-29 11:27:41 -08:00
Max Brunsfeld
a8f50986e0 clang-format 2015-12-24 22:05:54 -08:00
Max Brunsfeld
386b124866 Ensure that there are no duplicate lex states 2015-12-20 15:46:13 -08:00
Max Brunsfeld
1c6ad5f7e4 Rename ubiquitous_tokens -> extra_tokens in compiler API
They were already called this in the runtime code.
'Extra' is just easier to say.
2015-12-17 15:50:50 -08:00
Max Brunsfeld
f065eb0480 Remove unused parameter to LexConflictManager 2015-12-17 15:45:47 -08:00
Max Brunsfeld
a8d2585330 Fix resolution of shift-extra vs reduce actions 2015-12-17 15:19:58 -08:00
Max Brunsfeld
351b4f4aaa Remove unused parameters to ParseConflictManager 2015-12-17 15:19:00 -08:00
Max Brunsfeld
c495076adb Record in parse table which actions can hide splits
Suppose a parse state S has multiple actions for a terminal lookahead symbol A.
Then during incremental parsing, while in state S, the parser should not
reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B
might prematurely discard one of the possible actions that a batch parser
would have attempted in state S, upon seeing A as a lookahead.
2015-12-17 13:11:56 -08:00
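
The reuse rule described in c495076adb amounts to a set-intersection check. The sketch below assumes symbols are plain strings and that FIRST(B) has already been computed; `can_reuse_nonterminal` and `Symbol` are hypothetical names, not taken from the codebase.

```cpp
#include <set>
#include <string>

// Hypothetical symbol representation for this sketch.
using Symbol = std::string;

// `split_terminals`: terminal lookaheads for which state S has more than one
// action. `first_of_b`: FIRST(B) for the candidate non-terminal B.
// Returns true if B may be reused while the parser is in state S.
bool can_reuse_nonterminal(const std::set<Symbol> &split_terminals,
                           const std::set<Symbol> &first_of_b) {
  for (const Symbol &terminal : first_of_b) {
    // If B can begin with a terminal that has multiple actions in S,
    // reusing B could hide one of those actions.
    if (split_terminals.count(terminal) > 0) {
      return false;
    }
  }
  return true;
}
```

If state S has multiple actions for `a`, then a non-terminal whose FIRST set is `{a, b}` is rejected for reuse, while one whose FIRST set is `{c}` is accepted.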
Max Brunsfeld
8e7ed275c9 Remove stray file 2015-12-17 11:55:08 -08:00
Max Brunsfeld
77a94a2929 Use ::count to check if sets and maps contain elements 2015-12-17 10:05:42 -08:00
Max Brunsfeld
66144dc28e Treat tokens that are sometimes extra as fragile 2015-12-16 20:04:45 -08:00
Max Brunsfeld
d713054d61 Record which tokens are fragile when lexing 2015-12-10 21:05:54 -08:00
Max Brunsfeld
75f31a79a3 Treat reduce actions with different production IDs as distinct 2015-12-10 13:00:26 -08:00
Max Brunsfeld
26dad87299 Always mark reduce actions as fragile when they're discarded due to precedence 2015-12-06 14:09:24 -08:00
Max Brunsfeld
08d50c25ae clang-format 2015-12-04 20:56:33 -08:00
Max Brunsfeld
ad619d95f6 Add 'extra' field to symbol metadata
This stores whether a symbol is only ever used as a ubiquitous token. This will
allow ubiquitous nodes to be reused more effectively: if they are always
ubiquitous, then they can be reused immediately, and otherwise, they must be
broken down in case they need to be used structurally.
2015-12-02 15:10:24 -08:00
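
A minimal sketch of how the 'extra' flag from ad619d95f6 could drive the reuse decision. The field names other than `extra`, the `ReuseDecision` enum, and `decide_ubiquitous_reuse` are assumptions made for illustration, not the project's actual definitions.

```cpp
// Hypothetical metadata bitfield; only `extra` is named in the commit message.
struct SymbolMetadata {
  bool visible : 1;
  bool named : 1;
  bool extra : 1;  // true if the symbol is only ever used as a ubiquitous token
};

enum class ReuseDecision { ReuseImmediately, BreakDown };

// A ubiquitous node can be reused right away only if its symbol is always
// ubiquitous; otherwise it is broken down in case it is needed structurally.
ReuseDecision decide_ubiquitous_reuse(const SymbolMetadata &metadata) {
  return metadata.extra ? ReuseDecision::ReuseImmediately
                        : ReuseDecision::BreakDown;
}
```

Packing the flags into single bits keeps the per-symbol record small, which is the benefit the bitfield commit below points to when it mentions storing further metadata.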
Max Brunsfeld
f08554e958 Replace NodeType enum with SymbolMetadata bitfield
This will allow storing other metadata about symbols, like if they
only appear as ubiquitous tokens
2015-12-02 15:10:24 -08:00
Max Brunsfeld
53424699e4 Comment all the steps of prepare_grammar 2015-12-02 14:56:59 -08:00
Max Brunsfeld
e11515fb74 Escape backslashes and quotes in symbol name strings 2015-11-09 09:33:24 -08:00
Max Brunsfeld
d5ce268074 Fix handling of changing precedence within lexical rules.
A precedence annotation wrapping a sequence of characters now only affects how
tightly those characters bind to *each other*, not how tightly they bind to the
preceding character.

This bug surfaced because a generated lexer was failing to recognize a '\n' character
as a token, instead treating it as ubiquitous whitespace. It made this error
because, even though anonymous ubiquitous tokens have the lowest precedence, the
character immediately *after* the '\n' was part of a normal token, which had
*normal* precedence (0). Advancing into that following token was incorrectly
prioritized above accepting the line-break token.
2015-11-08 13:36:15 -08:00
Max Brunsfeld
7415c623aa clang-format 2015-11-01 21:21:07 -08:00
Max Brunsfeld
5073af0d03 Extract helper method for precedence in lex_item_transitions 2015-11-01 21:20:59 -08:00
Max Brunsfeld
d7cb48aae7 Fix handling of precedence for repeat rules 2015-11-01 21:00:44 -08:00
Max Brunsfeld
d6ee28abd0 Make precedence more useful within tokens
Choose accept-token actions over advance actions if their rule has a higher precedence.
2015-11-01 12:48:27 -08:00
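
The rule stated in d6ee28abd0 can be sketched as a single comparison. `LexChoice` and `choose_lex_action` are hypothetical names, and resolving ties in favor of advancing is an assumption of this sketch, since the commit message only covers the strictly-higher case.

```cpp
enum class LexChoice { Accept, Advance };

// `accept_precedence`: precedence of the rule whose token could be accepted now.
// `advance_precedence`: precedence of the rule that would consume more input.
LexChoice choose_lex_action(int accept_precedence, int advance_precedence) {
  // Accept only when the completed rule has strictly higher precedence.
  // Ties fall through to Advance (an assumption of this sketch).
  return accept_precedence > advance_precedence ? LexChoice::Accept
                                                : LexChoice::Advance;
}
```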
Max Brunsfeld
998ae533da Make completion_status() a method on LexItem 2015-10-30 16:48:37 -07:00
Max Brunsfeld
c8be143f65 🔥 get_metadata function 2015-10-30 16:22:25 -07:00
Max Brunsfeld
73b3280fbb Include precedence calculation in LexItemSet::transitions 2015-10-30 16:07:29 -07:00
Max Brunsfeld
e9be0ff24e Make completion_status() a method on ParseItem 2015-10-30 14:07:33 -07:00
Max Brunsfeld
4850384b78 Include precedence calculation in ParseItemSet::transitions 2015-10-30 13:54:11 -07:00
Max Brunsfeld
a8ead10d6f In lex error state, don't look for tokens that would match *any* line 2015-10-28 17:45:17 -07:00
Max Brunsfeld
dba0726eef clang format 2015-10-28 12:10:58 -07:00
Max Brunsfeld
b61b27f22f Handle inline ubiquitous tokens that are used elsewhere in the grammar 2015-10-26 17:19:37 -07:00