113 commits

Author SHA1 Message Date
Max Brunsfeld
c495076adb Record in parse table which actions can hide splits
Suppose a parse state S has multiple actions for a terminal lookahead symbol A.
Then during incremental parsing, while in state S, the parser should not
reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B
might prematurely discard one of the possible actions that a batch parser
would have attempted in state S, upon seeing A as a lookahead.
2015-12-17 13:11:56 -08:00
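The reuse rule described in the commit above can be sketched as a small check. This is only an illustration under stated assumptions: `ParseState`, `FirstSets`, and `can_reuse_lookahead` are hypothetical names, actions are plain strings, and the FIRST sets are assumed to be computed elsewhere. It is not the code from the commit.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Symbol = std::string;

// Hypothetical, simplified parse state: maps each terminal lookahead to the
// actions recorded for it (actions are just labels in this sketch).
struct ParseState {
  std::map<Symbol, std::vector<std::string>> actions;
};

// FIRST sets for non-terminals, assumed to be precomputed.
using FirstSets = std::map<Symbol, std::set<Symbol>>;

// Returns true if reusing `non_terminal` as a lookahead in `state` cannot
// hide a split: no terminal in FIRST(non_terminal) has more than one action.
bool can_reuse_lookahead(const ParseState &state, const FirstSets &first_sets,
                         const Symbol &non_terminal) {
  auto first = first_sets.find(non_terminal);
  if (first == first_sets.end()) return false;  // unknown symbol: be conservative
  for (const Symbol &terminal : first->second) {
    auto entry = state.actions.find(terminal);
    if (entry != state.actions.end() && entry->second.size() > 1) return false;
  }
  return true;
}
```

The commit title suggests the real parse table records this information per action ahead of time; the sketch recomputes it on demand only to keep the example short.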
Max Brunsfeld
75f31a79a3 Treat reduce actions with different production IDs as distinct 2015-12-10 13:00:26 -08:00
Max Brunsfeld
53424699e4 Comment all the steps of prepare_grammar 2015-12-02 14:56:59 -08:00
Max Brunsfeld
d5ce268074 Fix handling of changing precedence within lexical rules.
A precedence annotation wrapping a sequence of characters now only affects how
tightly those characters bind to *each other*, not how tightly they bind to the
preceding character.

This bug surfaced because a generated lexer was failing to recognize a '\n' character
as a token, instead treating it as ubiquitous whitespace. It made this error
because, even though anonymous ubiquitous tokens have the lowest precedence, the
character immediately *after* the '\n' was part of a normal token, which had
*normal* precedence (0). Advancing into that following token was incorrectly
prioritized above accepting the line-break token.
2015-11-08 13:36:15 -08:00
Max Brunsfeld
d6ee28abd0 Make precedence more useful within tokens
Choose accept-token actions over advance actions if their rule has a higher precedence.
2015-11-01 12:48:27 -08:00
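A minimal sketch of the choice described in the commit above. The `LexAction` type and `choose_action` function are hypothetical, and the fallback to advancing when precedences are equal is an assumption (the usual longest-match behaviour), not something the commit message states.

```cpp
#include <cassert>

enum class LexActionType { Advance, Accept };

// Hypothetical lex action carrying the precedence of the rule it came from.
struct LexAction {
  LexActionType type;
  int precedence;
};

// Picks the accept-token action only when its rule has strictly higher
// precedence than the advance action. Otherwise the lexer keeps advancing
// (assumed default in this sketch).
LexAction choose_action(const LexAction &accept, const LexAction &advance) {
  assert(accept.type == LexActionType::Accept);
  assert(advance.type == LexActionType::Advance);
  return accept.precedence > advance.precedence ? accept : advance;
}
```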
Max Brunsfeld
b61b27f22f Handle inline ubiquitous tokens that are used elsewhere in the grammar 2015-10-26 17:19:37 -07:00
Max Brunsfeld
9959fe35b0 Allow associativity to be specified in rules w/o precedence 2015-10-13 11:25:28 -07:00
Max Brunsfeld
4b817dc07c Fix linter errors 2015-10-12 19:22:05 -07:00
Max Brunsfeld
82726ad53b Define repeat rule in terms of repeat1 rule 2015-10-12 19:22:05 -07:00
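One way to read the commit above is that `repeat(x)` becomes shorthand for "either `repeat1(x)` or nothing". The sketch below uses a hypothetical miniature rule tree (`Rule`, `blank`, `sym`, `choice`, `repeat1`, `repeat` are illustrative names, not the compiler's actual classes) to show that definition.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Rule;
using RulePtr = std::shared_ptr<const Rule>;

// Hypothetical, minimal rule tree for illustration only.
struct Rule {
  enum Kind { Blank, Symbol, Choice, Repeat1 } kind;
  std::string name;               // used by Symbol only
  std::vector<RulePtr> children;  // used by Choice and Repeat1
};

RulePtr blank() { return std::make_shared<Rule>(Rule{Rule::Blank, "", {}}); }

RulePtr sym(const std::string &name) {
  return std::make_shared<Rule>(Rule{Rule::Symbol, name, {}});
}

RulePtr choice(std::vector<RulePtr> options) {
  return std::make_shared<Rule>(Rule{Rule::Choice, "", std::move(options)});
}

// One or more occurrences of `content`.
RulePtr repeat1(RulePtr content) {
  return std::make_shared<Rule>(Rule{Rule::Repeat1, "", {std::move(content)}});
}

// Zero or more occurrences, defined in terms of repeat1: either one or more
// occurrences, or nothing at all.
RulePtr repeat(RulePtr content) {
  return choice({repeat1(std::move(content)), blank()});
}
```

With this definition, later compilation steps only need to handle `Repeat1`; the zero-occurrence case is covered by the ordinary choice and blank rules.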
Max Brunsfeld
5c67f58a4b Add helper for dynamic-casting to rule subclasses 2015-10-12 19:21:56 -07:00
Max Brunsfeld
db9966b57c Simplify lex item set transitions code 2015-10-11 22:51:37 -07:00
Max Brunsfeld
25791085c3 Normalize lexical grammar rules before constructing lex table 2015-10-11 16:56:00 -07:00
Max Brunsfeld
5e4bdcbaf8 Simplify handling of precedence & associativity in productions 2015-10-05 16:56:11 -07:00
Max Brunsfeld
ebc52f109d Merge branch 'flatten-rules-into-productions'
This branch had diverged considerably, so merging it required changing a lot
of code.

Conflicts:
	project.gyp
	spec/compiler/build_tables/action_takes_precedence_spec.cc
	spec/compiler/build_tables/build_conflict_spec.cc
	spec/compiler/build_tables/build_parse_table_spec.cc
	spec/compiler/build_tables/first_symbols_spec.cc
	spec/compiler/build_tables/item_set_closure_spec.cc
	spec/compiler/build_tables/item_set_transitions_spec.cc
	spec/compiler/build_tables/rule_can_be_blank_spec.cc
	spec/compiler/helpers/containers.h
	spec/compiler/prepare_grammar/expand_repeats_spec.cc
	spec/compiler/prepare_grammar/extract_tokens_spec.cc
	src/compiler/build_tables/action_takes_precedence.h
	src/compiler/build_tables/build_parse_table.cc
	src/compiler/build_tables/first_symbols.cc
	src/compiler/build_tables/first_symbols.h
	src/compiler/build_tables/item_set_closure.cc
	src/compiler/build_tables/item_set_transitions.cc
	src/compiler/build_tables/parse_item.cc
	src/compiler/build_tables/parse_item.h
	src/compiler/build_tables/rule_can_be_blank.cc
	src/compiler/build_tables/rule_can_be_blank.h
	src/compiler/prepare_grammar/expand_repeats.cc
	src/compiler/prepare_grammar/extract_tokens.cc
	src/compiler/prepare_grammar/extract_tokens.h
	src/compiler/prepare_grammar/prepare_grammar.cc
	src/compiler/rules/built_in_symbols.cc
	src/compiler/rules/built_in_symbols.h
	src/compiler/syntax_grammar.cc
	src/compiler/syntax_grammar.h
2015-10-02 23:46:39 -07:00
Max Brunsfeld
673ca411b1 Fix lint errors 2015-09-19 13:19:49 -07:00
Max Brunsfeld
67241e3052 Don't use std::set in public compiler header
Just use vectors
2015-09-08 23:43:45 -07:00
Max Brunsfeld
f9316933ad Refactor logic for marking '_'-prefixed rules as hidden 2015-09-06 16:53:13 -07:00
Max Brunsfeld
5982b77c97 In compiler, distinguish between anonymous tokens and hidden rules 2015-09-05 22:28:55 -07:00
Max Brunsfeld
bd77ab1ac9 Move public rule functions out of rule namespace
This way, there's only one public namespace: tree_sitter
2015-09-03 17:49:20 -07:00
Max Brunsfeld
0600f31847 🎨 Remove weird reference variables 2015-09-03 17:13:56 -07:00
Max Brunsfeld
c18351772a Auto-format: no single-line functions 2015-07-31 16:32:24 -07:00
Max Brunsfeld
93259435c8 Handle tokens that appear both anonymously and as named rules 2015-07-30 17:24:08 -07:00
Max Brunsfeld
f9b057f3a9 clang-format everything 2015-07-27 18:29:48 -07:00
Max Brunsfeld
31b2db12d2 Remove custom LexicalGrammar and SyntaxGrammar constructors 2015-07-19 16:12:11 -07:00
Max Brunsfeld
5d41d23ab1 Clean up extract_tokens 2015-07-19 11:46:30 -07:00
Max Brunsfeld
c9a482bbf3 Add expected_conflicts field to grammar 2015-06-26 16:14:08 -05:00
Max Brunsfeld
fd97b8a237 Dedup auxiliary repeat rules from different source rules 2015-05-02 20:42:47 -07:00
Max Brunsfeld
b4d93550b6 Remove const qualifier to appease gcc 2015-04-17 11:23:25 -07:00
Max Brunsfeld
e8db35af6b Avoid creating redundant auxiliary repeat rules 2015-04-16 17:42:22 -07:00
Max Brunsfeld
8ac4b9fc17 Store productions' end rule ids in the vector 2015-02-16 22:11:03 -08:00
Max Brunsfeld
52daffb3f3 Separate syntax rules into flat lists of symbols
This way, every ParseItem can be associated with a particular production
for its non-terminal. That lets us keep track of which productions are
involved in shift/reduce conflicts.
2015-02-16 22:11:03 -08:00
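The flattening described in the commit above can be illustrated with a small sketch. It assumes a hypothetical rule tree made only of symbols, sequences, and choices, and expands it into flat productions. `Rule`, `Production`, `ParseItem`, and `flatten` are illustrative names rather than the compiler's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical nested rule: a symbol, a sequence, or a choice.
struct Rule {
  enum Kind { Symbol, Seq, Choice } kind;
  std::string name;            // used by Symbol only
  std::vector<Rule> children;  // used by Seq and Choice
};

// A production is a flat list of symbol names.
using Production = std::vector<std::string>;

// With flat productions, an item can point at one specific production of its
// non-terminal, plus a position within it.
struct ParseItem {
  std::string lhs;
  std::size_t production_index;
  std::size_t position;
};

// Expands a nested rule into every flat production it can produce.
std::vector<Production> flatten(const Rule &rule) {
  switch (rule.kind) {
    case Rule::Symbol:
      return {Production{rule.name}};
    case Rule::Seq: {
      // Start with one empty production and extend it with each child.
      std::vector<Production> result = {Production{}};
      for (const Rule &child : rule.children) {
        std::vector<Production> next;
        for (const Production &prefix : result) {
          for (const Production &suffix : flatten(child)) {
            Production combined = prefix;
            combined.insert(combined.end(), suffix.begin(), suffix.end());
            next.push_back(combined);
          }
        }
        result = next;
      }
      return result;
    }
    case Rule::Choice: {
      // Each alternative contributes its own productions.
      std::vector<Production> result;
      for (const Rule &child : rule.children) {
        for (const Production &p : flatten(child)) result.push_back(p);
      }
      return result;
    }
  }
  return {};
}
```

For example, a sequence of `a` followed by a choice between `b` and `c` flattens to the two productions `a b` and `a c`, so a conflict can be reported against whichever one is involved.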
Max Brunsfeld
160fca6579 Refactor avoidance of redundant repeat rules 2015-01-14 21:11:19 -08:00
Max Brunsfeld
a0d9da9d5c Rename static 'Build' methods to 'build' 2015-01-14 21:11:05 -08:00
Max Brunsfeld
34d96909d1 Move {Syntax,Lexical}Grammar into separate files 2015-01-14 21:10:41 -08:00
Max Brunsfeld
4fc960e4ec Give extracted string/regex tokens more descriptive names 2014-10-31 08:56:58 -07:00
Max Brunsfeld
aae6f6de14 Remove whitespace between template closing tags 2014-10-12 11:51:12 -07:00
Max Brunsfeld
78c5fe8e02 clang-format 2014-10-03 15:44:21 -07:00
Max Brunsfeld
cb5ecbd491 Handle string and regex rules w/ non-ascii chars 2014-09-28 18:21:22 -07:00
Max Brunsfeld
209992c832 Remove trailing whitespace 2014-09-10 13:19:45 -07:00
Max Brunsfeld
9a93f6bdef Clean up prepare_grammar function 2014-09-10 13:02:31 -07:00
Max Brunsfeld
cd8a683229 Improve error messages for invalid ubiquitous tokens 2014-09-10 13:02:16 -07:00
Max Brunsfeld
2e7ffb4d14 Tweak auto-format settings
Prefer lines that exceed 80 characters by a small margin to
line breaks in argument lists
2014-09-09 13:15:40 -07:00
Max Brunsfeld
8f109504a8 Clean up extract_tokens function 2014-09-09 12:57:29 -07:00
Max Brunsfeld
9ee0665fad Remove unused code in extract_tokens.cc 2014-09-09 12:34:15 -07:00
Max Brunsfeld
e181426f6f Use make_tuple rather than init list syntax for gcc 2014-09-07 22:58:45 -07:00
Max Brunsfeld
1ff7cedf40 Unify ubiquitous tokens and lexical separators in API 2014-09-07 22:16:45 -07:00
Max Brunsfeld
a46f9d950c Handle '\s' correctly in regexps 2014-09-07 16:05:43 -07:00
Max Brunsfeld
2a9f51790f Move is_token function to its own file 2014-09-07 13:49:44 -07:00
Max Brunsfeld
ed11ef557a Fix expansion of repeat rules into recursive rules
Previously, the way repeat rules were expanded, the auxiliary
rule always needed to be reduced, even if the repeating content
was empty. This caused problems in parse states where some items
contained the repeat rule and some did not. To make those cases
work, the repeat rule had to explicitly be marked as optional.
With this change, that is no longer necessary.
2014-09-07 09:39:14 -07:00
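The difference between the two expansions can be sketched as data. This is one reading of the commit message above, not the actual implementation: `Expansion`, `expand_old`, and `expand_new` are hypothetical names, and the exact shape of the generated rules is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Production = std::vector<std::string>;

// Hypothetical result of expanding repeat(x) with an auxiliary rule `aux`.
struct Expansion {
  std::vector<Production> use_site;   // alternatives replacing repeat(x) where it was used
  std::vector<Production> auxiliary;  // productions of the auxiliary rule
};

// Assumed old scheme: the use site always refers to aux, and aux itself can
// be empty, so an empty aux must be reduced even when nothing repeats.
Expansion expand_old(const std::string &x, const std::string &aux) {
  Expansion e;
  e.use_site = {Production{aux}};
  e.auxiliary = {Production{x, aux}, Production{}};
  return e;
}

// Assumed new scheme: aux is never empty, and the use site chooses between
// aux and nothing, so the empty case needs no reduction of aux.
Expansion expand_new(const std::string &x, const std::string &aux) {
  Expansion e;
  e.use_site = {Production{aux}, Production{}};
  e.auxiliary = {Production{x, aux}, Production{x}};
  return e;
}

// True if any production of the auxiliary rule is empty.
bool auxiliary_can_be_empty(const Expansion &e) {
  for (const Production &p : e.auxiliary) {
    if (p.empty()) return true;
  }
  return false;
}
```

Under these assumptions, the optional part moves from the auxiliary rule to the place where the repeat is used, which is why the grammar author no longer has to mark the repeat as optional by hand.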
Max Brunsfeld
d3204d3526 Include '_' in '\w' regex character class 2014-09-05 18:41:12 -07:00
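As a small illustration of the commit above, a `\w` membership test that includes the underscore might look like the following. `is_word_char` is a hypothetical helper, and the sketch assumes ASCII input only.

```cpp
#include <cassert>

// Returns true for characters matched by '\w' in this sketch:
// ASCII letters, ASCII digits, and the underscore.
bool is_word_char(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9') || c == '_';
}
```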