Commit graph

60 commits

Author SHA1 Message Date
Max Brunsfeld
0daada6921 Tidy up c-code generator 2014-10-12 12:33:45 -07:00
Max Brunsfeld
5b624d37f6 Use consistent iteration in c code generator 2014-10-12 11:57:52 -07:00
Max Brunsfeld
2e7ffb4d14 Tweak auto-format settings
Prefer lines that exceed 80 characters by a small margin to
line breaks in argument lists
2014-09-09 13:15:40 -07:00
Max Brunsfeld
1ff7cedf40 Unify ubiquitous tokens and lexical separators in API 2014-09-07 22:16:45 -07:00
Max Brunsfeld
c0a3f8d39c Remove some macros from public parser header 2014-09-05 23:47:38 -07:00
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
346cf4fe5d Remove LEX_PANIC macro 2014-08-26 13:12:12 -07:00
Max Brunsfeld
0bb5663f0f Refactor - represent char sets in terms of inclusions and exclusions 2014-08-23 14:25:45 -07:00
Max Brunsfeld
fcdcdca303 Whitespace 2014-08-09 01:07:14 -07:00
Max Brunsfeld
1e79ed794b Allow multiple top-level nodes
Now, the root node of a document is always a document node.
It will often have only one child node which corresponds to the grammar's
start symbol, but not always. Currently, it may have more than one child
if there are ubiquitous tokens such as comments at the beginning of the
document. In the future, it will also be possible be possible to have multiple
for the document to have multiple children if the document is partially parsed.
2014-08-09 00:00:20 -07:00
Max Brunsfeld
8da9219c3a Remove redundant functions for Documents
There's no need for a `string` function since one already
exists for Nodes.

Now the root node is always stored on the document. This
means callers of `ts_document_root_node` don't need to release
its return value.
2014-08-08 12:58:51 -07:00
Max Brunsfeld
9366f11dcb In generated C, only format printable characters as char literals 2014-08-07 08:12:15 -07:00
Max Brunsfeld
eecbcccee0 Remove generated parsers' dependency on the runtime library
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime
2014-07-30 23:40:02 -07:00
Max Brunsfeld
98cc2f2264 Auto-format all source code with clang-format 2014-07-21 13:20:00 -07:00
Max Brunsfeld
18ae326459 Fix lint errors 2014-07-02 09:01:38 -07:00
Max Brunsfeld
83a1b9439e Fix handling of ubiquitous tokens used in grammar rules 2014-07-01 20:47:35 -07:00
Max Brunsfeld
9686c57e90 Allow ubiquitous tokens to also be used in grammar rules 2014-06-26 08:52:42 -07:00
Max Brunsfeld
7df35f9b8d Make separate types for syntax and lexical grammars
This way, the separator characters can be added as a field to
lexical grammars only
2014-06-25 13:27:16 -07:00
Max Brunsfeld
240de1ca61 Reorder methods in c code generator 2014-06-18 08:21:43 -07:00
Max Brunsfeld
c312f985c8 Refactor c code generator
It's been rewritten in a less functional style. String copies were actually
taking significant time for large parsers.
2014-06-16 21:29:04 -07:00
Max Brunsfeld
bb4d83ce47 Add regex postfix flags to javascript grammar
- Refactor statement terminators in javascript grammar
- Reorganize javascript language tests
2014-06-11 16:43:27 -07:00
Max Brunsfeld
e105f5cebc Remove inheritance link btwn PreparedGrammar and Grammar 2014-06-10 10:34:37 -07:00
Max Brunsfeld
e93e254518 In lexer, prefer tokens to skipped separator characters
This was causing newlines in go and javascript to be parsed as
meaningless separator characters instead of statement terminators
2014-05-30 13:29:54 -07:00
Max Brunsfeld
c30055ba18 Fix symbol names for extracted tokens 2014-05-20 08:30:58 -07:00
Max Brunsfeld
649f200831 Expand regex/string rules as part of grammar preparation
This makes it possible to report errors in regex parsing
2014-05-19 20:54:59 -07:00
Max Brunsfeld
3e0debf814 Fix compile error on gcc 2014-05-09 15:37:30 -07:00
Max Brunsfeld
34137be12d Represent state ids as unsigned shorts
This fixes some signedness conversion warnings
2014-05-08 13:23:46 -07:00
Max Brunsfeld
0a21eee3f0 Remove magic number from generated symbols enums
The symbol numbers 0 and 1 are reserved for 'error' and 'eof',
so the grammar's start symbol is always 2.
2014-05-08 13:14:45 -07:00
Max Brunsfeld
4700e33746 Introduce 'ubiquitous_tokens' concept, for parsing comments and such 2014-05-06 12:54:04 -07:00
Max Brunsfeld
0d763d229d cpplint 2014-04-28 21:46:43 -07:00
Max Brunsfeld
25eda9d889 ISymbol -> Symbol
Interned symbols are now the main type of symbol in use
2014-04-28 20:43:27 -07:00
Max Brunsfeld
93df5579b4 Trim whitespace 2014-04-25 22:17:23 -07:00
Max Brunsfeld
68d44fd565 Intern symbols during grammar preparation 2014-04-22 23:38:26 -07:00
Max Brunsfeld
a437d39773 Add rule precedence construct
Still need to add some way of expressing left and right
associativity
2014-04-15 08:40:46 -07:00
Max Brunsfeld
5145bba53d Silence missing-initializer warnings for gcc 2014-04-12 20:16:16 -07:00
Max Brunsfeld
bd5ec68c96 Get generated parsers building under gcc 2014-04-08 22:11:20 -07:00
Max Brunsfeld
6a0a28f4b3 WIP - try to fix travis build 2014-04-08 21:41:38 -07:00
Max Brunsfeld
1cc7e32e2d Fix handling of tokens consisting of separator characters
The parser is no longer hard-coded to skip whitespace. Tokens
such as newlines, whose characters overlap with the separator
characters, can now be correctly recognized.
2014-04-03 19:10:09 -07:00
Max Brunsfeld
a79a7435de Remove remaining trailing whitespace from generated c code 2014-03-29 19:21:42 -07:00
Max Brunsfeld
8e1b78ca8e Remove trailing whitespace from generated c code 2014-03-29 19:00:31 -07:00
Max Brunsfeld
13c4e6e648 Tweak format for example grammars 2014-03-28 13:51:32 -07:00
Max Brunsfeld
e1ac62edc5 Give better symbol names to generated tokens
This should make debugging easier
2014-03-27 12:54:54 -07:00
Max Brunsfeld
3f770ff3c3 Remove unused consumed_symbols vector from parse items 2014-03-26 21:04:11 -07:00
Max Brunsfeld
4454925b5a Clean up parser macros more 2014-03-26 13:03:12 -07:00
Max Brunsfeld
f601322956 Clean up macros in parser.h 2014-03-25 19:51:34 -07:00
Max Brunsfeld
80b19cbb83 Construct entire parse table statically
This removes the need for the 'init_parse_table' function,
which was not really thread safe
2014-03-25 19:34:17 -07:00
Max Brunsfeld
25861b7f03 Remove reduction-specific collapse flags in favor of globally hidden symbols 2014-03-25 09:05:55 -07:00
Max Brunsfeld
aac0786449 Resolve token conflicts by tokens' order in grammar 2014-03-24 19:18:06 -07:00
Max Brunsfeld
9cb92a0a96 Remove code related to old error recovery function 2014-03-24 13:17:38 -07:00