Commit graph

85 commits

Author SHA1 Message Date
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
2985a98150 Build error nodes in lexer again, not in parser 2014-08-31 16:59:01 -07:00
Max Brunsfeld
226ffd6b5b Fix initializer list deduction warnings in specs 2014-08-27 22:23:45 -07:00
Max Brunsfeld
e0a53b9f14 Make parse and lex debug output more readable 2014-08-27 18:27:53 -07:00
Max Brunsfeld
bd145d2c6a Preserve the initial error node in handle_error function 2014-08-26 23:22:18 -07:00
Max Brunsfeld
37d5db6fee Move newline in lexer debugging output 2014-08-26 22:21:21 -07:00
Max Brunsfeld
346cf4fe5d Remove LEX_PANIC macro 2014-08-26 13:12:12 -07:00
Max Brunsfeld
77941c85ff Avoid building incomplete error nodes during lexing
The lexer doesn't know the expected symbols, so it doesn't have enough
information to construct error nodes. Now, when it encounters an invalid
character, it returns NULL and the parser builds a correct error node.
2014-08-25 23:35:00 -07:00
Max Brunsfeld
412cc93812 clang format 2014-07-31 13:11:39 -07:00
Max Brunsfeld
0d6d09cbd9 In generated parsers, export language as a function 2014-07-31 13:04:46 -07:00
Max Brunsfeld
909261d742 Remove unnecessary type name in parser.h 2014-07-31 12:31:38 -07:00
Max Brunsfeld
eecbcccee0 Remove generated parsers' dependency on the runtime library
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime
2014-07-30 23:40:02 -07:00
Max Brunsfeld
98cc2f2264 Auto-format all source code with clang-format 2014-07-21 13:20:00 -07:00
Max Brunsfeld
df359bc01f Use 2-space indent in c files 2014-07-20 20:27:33 -07:00
Max Brunsfeld
b3385f20c8 Hide TSTree, expose TSNode 2014-07-17 23:29:11 -07:00
Max Brunsfeld
779bf0d745 Don't store tree's hidden children in a separate array
Just mark hidden trees as such, and skip them when
pretty-printing a tree
2014-07-17 13:36:53 -07:00
Max Brunsfeld
9da7663e99 Combine TSParser and TSStateMachine objects
My original thought was to decouple the runtime from
the LR parser generator by making TSParser a generic
interface that LR parsers implement.

I think this was more trouble than it was worth.
2014-07-10 13:23:20 -07:00
Max Brunsfeld
83a1b9439e Fix handling of ubiquitous tokens used in grammar rules 2014-07-01 20:47:35 -07:00
Max Brunsfeld
0ec3faba3e Rename type ts_lr_parser -> TSStateMachine 2014-06-28 19:22:16 -07:00
Max Brunsfeld
27f6eb725d Rename type ts_parse_action -> TSParseAction 2014-06-28 19:06:37 -07:00
Max Brunsfeld
9d4fcf75de Rename type ts_lexer, ts_parser -> TSLexer, TSParser 2014-06-28 19:01:46 -07:00
Max Brunsfeld
c8797bfa27 Rename type ts_input_edit -> TSInputEdit 2014-06-28 18:56:47 -07:00
Max Brunsfeld
ff13122419 Rename type ts_input -> TSInput 2014-06-28 18:56:04 -07:00
Max Brunsfeld
7e0d46002c Rename type ts_state_id -> TSStateId 2014-06-28 18:51:06 -07:00
Max Brunsfeld
5f59de72a8 Rename type ts_tree -> TSTree 2014-06-28 18:48:07 -07:00
Max Brunsfeld
9686c57e90 Allow ubiquitous tokens to also be used in grammar rules 2014-06-26 08:52:42 -07:00
Max Brunsfeld
155a57d3ab Prevent infinite loop on certain lex errors 2014-06-11 11:49:06 -07:00
Max Brunsfeld
21c259df9c Clean up lint errors 2014-06-09 21:14:38 -07:00
Max Brunsfeld
12331d66f5 Fix memory leaks 2014-06-09 13:12:44 -07:00
Max Brunsfeld
9a4889176e Move lr_parser implementation into a separate .c file 2014-06-04 13:34:37 -07:00
Max Brunsfeld
e93e254518 In lexer, prefer tokens to skipped separator characters
This was causing newlines in go and javascript to be parsed as
meaningless separator characters instead of statement terminators
2014-05-30 13:29:54 -07:00
Max Brunsfeld
3e0debf814 Fix compile error on gcc 2014-05-09 15:37:30 -07:00
Max Brunsfeld
292b753914 Move lr_parser into its own header file 2014-05-09 14:43:43 -07:00
Max Brunsfeld
ccc1b41f2a Make separate header files for stack and lexer 2014-05-09 13:32:12 -07:00
Max Brunsfeld
e4be585c43 Handle ubiquitous tokens at the beginning of programs
As a final step before returning the finished parse tree, check if
there are still multiple nodes on the stack. If so, make the inner
nodes children of the top node.
2014-05-09 12:46:36 -07:00
Max Brunsfeld
3f374c6547 Tidy up 2014-05-08 13:27:48 -07:00
Max Brunsfeld
34137be12d Represent state ids as unsigned shorts
This fixes some signedness conversion warnings
2014-05-08 13:23:46 -07:00
Max Brunsfeld
013572671f Use smaller integer types for parse table 2014-05-08 08:45:41 -07:00
Max Brunsfeld
4700e33746 Introduce 'ubiquitous_tokens' concept, for parsing comments and such 2014-05-06 12:54:04 -07:00
Max Brunsfeld
5708a181c2 Removed unused field on reduce parse actions 2014-05-01 23:29:01 -07:00
Max Brunsfeld
a437d39773 Add rule precedence construct
Still need to add some way of expressing left and right
associativity
2014-04-15 08:40:46 -07:00
Max Brunsfeld
e23604ac52 Fix debugging macros in parser.h 2014-04-14 22:31:11 -07:00
Max Brunsfeld
5145bba53d Silence missing-initializer warnings for gcc 2014-04-12 20:16:16 -07:00
Max Brunsfeld
bd5ec68c96 Get generated parsers building under gcc 2014-04-08 22:11:20 -07:00
Max Brunsfeld
5320cad065 Trim trailing whitespace 2014-04-04 13:10:55 -07:00
Max Brunsfeld
1cc7e32e2d Fix handling of tokens consisting of separator characters
The parser is no longer hard-coded to skip whitespace. Tokens
such as newlines, whose characters overlap with the separator
characters, can now be correctly recognized.
2014-04-03 19:10:09 -07:00
Max Brunsfeld
13c4e6e648 Tweak format for example grammars 2014-03-28 13:51:32 -07:00
Max Brunsfeld
4454925b5a Clean up parser macros more 2014-03-26 13:03:12 -07:00
Max Brunsfeld
f601322956 Clean up macros in parser.h 2014-03-25 19:51:34 -07:00