Commit graph

45 commits

Author SHA1 Message Date
Max Brunsfeld
4b04afac5e Control lexer's error-mode via explicit boolean argument
Previously, the lexer would operate in error-mode (ignoring any garbage input
until it found a valid token) if it was invoked in the 'error' state. Now that
the error state is deduped with other lexical states, the lexer might be invoked
in that state even when error-mode is not intended. This adds a third argument
to `ts_lex` that explicitly sets the error-mode.

This bug was unlikely to occur in any real grammars, but it caused the
node-tree-sitter-compiler test suite to fail for some grammars with only one
rule.
2015-12-30 09:43:12 -08:00
Max Brunsfeld
939476c947 When removing duplicate lex states, update the error state too
Now, instead of being stored as a separate field on the parse table, the error
state is just the first state in the states vector.
2015-12-29 21:02:24 -08:00
Max Brunsfeld
97a281502e Store parse table more compactly 2015-12-29 11:27:41 -08:00
Max Brunsfeld
c495076adb Record in parse table which actions can hide splits
Suppose a parse state S has multiple actions for a terminal lookahead symbol A.
Then during incremental parsing, while in state S, the parser should not
reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B
might prematurely discard one of the possible actions that a batch parser
would have attempted in state S, upon seeing A as a lookahead.
2015-12-17 13:11:56 -08:00
Max Brunsfeld
66144dc28e Treat tokens that are sometimes extra as fragile 2015-12-16 20:04:45 -08:00
Max Brunsfeld
9bff4d0b06 Add concise method syntax to javascript fixture grammar
This exposes an ambiguity handling bug that I discovered while adding ES6 support to
tree-sitter-javascript
2015-12-15 22:25:48 -08:00
Max Brunsfeld
d713054d61 Record which tokens are fragile when lexing 2015-12-10 21:05:54 -08:00
Max Brunsfeld
75f31a79a3 Treat reduce actions with different production IDs as distinct 2015-12-10 13:00:26 -08:00
Max Brunsfeld
76e4599d5e For now, allow any expression as an assignment LHS 2015-12-06 14:14:17 -08:00
Max Brunsfeld
ad619d95f6 Add 'extra' field to symbol metadata
This stores whether a symbol is only ever used as a ubiquitous token. This will
allow ubiquitous nodes to be reused more effectively: if they are always
ubiquitous, then they can be reused immediately, and otherwise, they must be
broken down in case they need to be used structurally.
2015-12-02 15:10:24 -08:00
Max Brunsfeld
f08554e958 Replace NodeType enum with SymbolMetadata bitfield
This will allow storing other metadata about symbols, like if they
only appear as ubiquitous tokens
2015-12-02 15:10:24 -08:00
Max Brunsfeld
d7cb48aae7 Fix handling of precedence for repeat rules 2015-11-01 21:00:44 -08:00
Max Brunsfeld
d6ee28abd0 Make precedence more useful within tokens
Choose accept-token actions over advance actions if their rule has a higher precedence.
2015-11-01 12:48:27 -08:00
Max Brunsfeld
998ae533da Make completion_status() a method on LexItem 2015-10-30 16:48:37 -07:00
Max Brunsfeld
1983bcfb60 Fix conflation of finished items w/ different precedence 2015-10-18 12:51:32 -07:00
Max Brunsfeld
8725e96a65 Fix item-set-closure bug w/ empty productions 2015-10-15 23:59:47 -07:00
Max Brunsfeld
82726ad53b Define repeat rule in terms of repeat1 rule 2015-10-12 19:22:05 -07:00
Max Brunsfeld
624e4651d2 Use GLR for for-in loop conlfict in javascript grammar 2015-10-06 10:50:56 -07:00
Max Brunsfeld
ebc52f109d Merge branch 'flatten-rules-into-productions'
This branch had diverged considerably, so merging it required changing a lot
of code.

Conflicts:
	project.gyp
	spec/compiler/build_tables/action_takes_precedence_spec.cc
	spec/compiler/build_tables/build_conflict_spec.cc
	spec/compiler/build_tables/build_parse_table_spec.cc
	spec/compiler/build_tables/first_symbols_spec.cc
	spec/compiler/build_tables/item_set_closure_spec.cc
	spec/compiler/build_tables/item_set_transitions_spec.cc
	spec/compiler/build_tables/rule_can_be_blank_spec.cc
	spec/compiler/helpers/containers.h
	spec/compiler/prepare_grammar/expand_repeats_spec.cc
	spec/compiler/prepare_grammar/extract_tokens_spec.cc
	src/compiler/build_tables/action_takes_precedence.h
	src/compiler/build_tables/build_parse_table.cc
	src/compiler/build_tables/first_symbols.cc
	src/compiler/build_tables/first_symbols.h
	src/compiler/build_tables/item_set_closure.cc
	src/compiler/build_tables/item_set_transitions.cc
	src/compiler/build_tables/parse_item.cc
	src/compiler/build_tables/parse_item.h
	src/compiler/build_tables/rule_can_be_blank.cc
	src/compiler/build_tables/rule_can_be_blank.h
	src/compiler/prepare_grammar/expand_repeats.cc
	src/compiler/prepare_grammar/extract_tokens.cc
	src/compiler/prepare_grammar/extract_tokens.h
	src/compiler/prepare_grammar/prepare_grammar.cc
	src/compiler/rules/built_in_symbols.cc
	src/compiler/rules/built_in_symbols.h
	src/compiler/syntax_grammar.cc
	src/compiler/syntax_grammar.h
2015-10-02 23:46:39 -07:00
Max Brunsfeld
d032114d7a js grammar - recover from errors on semicolons but not line-breaks 2015-09-22 21:12:28 -07:00
Max Brunsfeld
7ee5eaa16a Rename node accessor methods
Instead of child() vs concrete_child(), next_sibling() vs next_concrete_sibling(), etc,
the default is switched: child() refers to the concrete syntax tree, and named_child()
refers to the AST. Because the AST is abstract through exclusion of some nodes, the
names are clearer if the qualifier goes on the AST operations
2015-09-08 23:16:24 -07:00
Max Brunsfeld
f9316933ad Refactor logic for marking '_'-prefixed rules as hidden 2015-09-06 16:53:13 -07:00
Max Brunsfeld
9591c88f39 In runtime, distinguish between anonymous and hidden nodes 2015-09-06 00:12:37 -07:00
Max Brunsfeld
5982b77c97 In compiler, distinguish between anonymous tokens and hidden rules 2015-09-05 22:28:55 -07:00
Max Brunsfeld
acf9280eda Make expression and statement rules hidden in javascript grammar 2015-09-02 13:05:54 -07:00
Max Brunsfeld
21258e6a9e Remove 'document' wrapper node 2015-08-22 10:48:34 -07:00
Max Brunsfeld
0b1d70db34 Always resolve ambiguities immediately
No more ambiguity nodes.
Also, when merging parse stacks, merge their successors if needed.
2015-07-15 13:15:11 -07:00
Max Brunsfeld
f26ddf5187 Fix symbol name for ambiguity nodes 2015-07-08 17:31:21 -07:00
Max Brunsfeld
381f89f8ba Create ambiguity nodes when joining stack heads 2015-06-18 17:03:16 -07:00
Max Brunsfeld
755894b44d Allow multiple parse actions in parse table 2015-06-18 17:03:16 -07:00
Max Brunsfeld
44774119bf Use all caps for built-in symbol names 2015-06-15 15:26:05 -07:00
Max Brunsfeld
fd97b8a237 Dedup auxiliary repeat rules from different source rules 2015-05-02 20:42:47 -07:00
Max Brunsfeld
a19b0e75ac 🔥 keyword and keypattern functions
Just make strings have higher precedence than regexps.
2015-03-22 16:00:26 -07:00
Max Brunsfeld
80ec303b10 Replace prec rule w/ left_assoc and right_assoc
Consider shift/reduce conflicts to be compilation errors unless
they are resolved by a specified associativity.
2015-03-16 23:12:34 -07:00
Max Brunsfeld
9a198562e0 Treat parse conflicts as errors in grammar compilation
For now, only reduce/reduce conflicts w/ no tie-breaking precedence
are treated as errors. The rest are dropped, because shift/reduce
conflicts are currently very common because we don't have a way
of specifying associativity along w/ precedence.
2015-03-15 20:31:41 -07:00
Max Brunsfeld
2d436cf141 Identify fragile reductions at compile time 2015-02-21 15:11:03 -08:00
Max Brunsfeld
bee756b414 Regenerate fixture parsers with better token names 2014-11-01 17:00:49 -07:00
Max Brunsfeld
d8099826f9 Workaround duplicate symbol problem in golang grammar
Currently, a grammar can't have a string rule and a keyword rule with
the same string.
2014-10-21 13:03:50 -07:00
Max Brunsfeld
8a03bd8978 Comment out some parts of JS grammar to speed up tests 2014-10-21 09:01:56 -07:00
Max Brunsfeld
91cf35b72c Rework javascript fixture grammar 2014-10-21 08:49:16 -07:00
Max Brunsfeld
22ee68e1a9 Make node for each var assignment in JS grammar 2014-10-15 15:04:57 -07:00
Max Brunsfeld
8c3ea6ef30 In JS grammar, don't include enclosing parens in formal_parameters node 2014-10-10 17:11:14 -07:00
Max Brunsfeld
3bcb221379 Include non-terminal lookahead symbols for reduction actions
This is necessary for re-using the right subtree after an edit
2014-10-10 12:06:16 -07:00
Max Brunsfeld
68d6e242ee Fix parsing of wildcard patterns at the ends of documents
- Remove special EOF handling from lexer
- Explicitly exclude the EOF character from all-inclusive character sets.
2014-09-11 13:10:23 -07:00
Max Brunsfeld
9682ef6c79 Rename examples directory to spec/fixtures
- I want to move away from having complete grammars for real languages
  (e.g. javascript, golang) in this repo. These languages take a long
  time to compile, and they now exist in their own repos
  (node-tree-sitter-javascript etc).
- I want to start testing more compiler edge cases through integration
  tests, so I want to put more small, weird grammars in here. That makes
  me not want to call the directory `examples`.
2014-09-10 13:31:06 -07:00
Renamed from examples/parsers/javascript.c (Browse further)