Max Brunsfeld
5dc08ccce9
Include names of in-progress rules in shift/reduce conflicts
2014-11-05 18:39:50 -08:00
Max Brunsfeld
fc83322832
Handle more special characters in c symbol names
2014-10-31 18:37:56 -07:00
Max Brunsfeld
4fc960e4ec
Give extracted string/regex tokens more descriptive names
2014-10-31 08:56:58 -07:00
Max Brunsfeld
0daada6921
Tidy up c-code generator
2014-10-12 12:33:45 -07:00
Max Brunsfeld
5b624d37f6
Use consistent iteration in c code generator
2014-10-12 11:57:52 -07:00
Max Brunsfeld
2e7ffb4d14
Tweak auto-format settings
...
Prefer lines that exceed 80 characters by a small margin to
line breaks in argument lists
2014-09-09 13:15:40 -07:00
Max Brunsfeld
1ff7cedf40
Unify ubiquitous tokens and lexical separators in API
2014-09-07 22:16:45 -07:00
Max Brunsfeld
c0a3f8d39c
Remove some macros from public parser header
2014-09-05 23:47:38 -07:00
Max Brunsfeld
545e575508
Revert "Remove the separator characters construct"
...
This reverts commit 5cd07648fd .
The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.
Conflicts:
src/compiler/build_tables/build_lex_table.cc
src/compiler/grammar.cc
src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd
Remove the separator characters construct
...
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.
For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
346cf4fe5d
Remove LEX_PANIC macro
2014-08-26 13:12:12 -07:00
Max Brunsfeld
0bb5663f0f
Refactor - represent char sets in terms of inclusions and exclusions
2014-08-23 14:25:45 -07:00
Max Brunsfeld
fcdcdca303
Whitespace
2014-08-09 01:07:14 -07:00
Max Brunsfeld
1e79ed794b
Allow multiple top-level nodes
...
Now, the root node of a document is always a document node.
It will often have only one child node which corresponds to the grammar's
start symbol, but not always. Currently, it may have more than one child
if there are ubiquitous tokens such as comments at the beginning of the
document. In the future, it will also be possible be possible to have multiple
for the document to have multiple children if the document is partially parsed.
2014-08-09 00:00:20 -07:00
Max Brunsfeld
8da9219c3a
Remove redundant functions for Documents
...
There's no need for a `string` function since one already
exists for Nodes.
Now the root node is always stored on the document. This
means callers of `ts_document_root_node` don't need to release
its return value.
2014-08-08 12:58:51 -07:00
Max Brunsfeld
9366f11dcb
In generated C, only format printable characters as char literals
2014-08-07 08:12:15 -07:00
Max Brunsfeld
eecbcccee0
Remove generated parsers' dependency on the runtime library
...
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime
2014-07-30 23:40:02 -07:00
Max Brunsfeld
98cc2f2264
Auto-format all source code with clang-format
2014-07-21 13:20:00 -07:00
Max Brunsfeld
18ae326459
Fix lint errors
2014-07-02 09:01:38 -07:00
Max Brunsfeld
83a1b9439e
Fix handling of ubiquitous tokens used in grammar rules
2014-07-01 20:47:35 -07:00
Max Brunsfeld
9686c57e90
Allow ubiquitous tokens to also be used in grammar rules
2014-06-26 08:52:42 -07:00
Max Brunsfeld
7df35f9b8d
Make separate types for syntax and lexical grammars
...
This way, the separator characters can be added as a field to
lexical grammars only
2014-06-25 13:27:16 -07:00
Max Brunsfeld
240de1ca61
Reorder methods in c code generator
2014-06-18 08:21:43 -07:00
Max Brunsfeld
c312f985c8
Refactor c code generator
...
It's been rewritten in a less functional style. String copies were actually
taking significant time for large parsers.
2014-06-16 21:29:04 -07:00
Max Brunsfeld
bb4d83ce47
Add regex postfix flags to javascript grammar
...
- Refactor statement terminators in javascript grammar
- Reorganize javascript language tests
2014-06-11 16:43:27 -07:00
Max Brunsfeld
e105f5cebc
Remove inheritance link btwn PreparedGrammar and Grammar
2014-06-10 10:34:37 -07:00
Max Brunsfeld
e93e254518
In lexer, prefer tokens to skipped separator characters
...
This was causing newlines in go and javascript to be parsed as
meaningless separator characters instead of statement terminators
2014-05-30 13:29:54 -07:00
Max Brunsfeld
c30055ba18
Fix symbol names for extracted tokens
2014-05-20 08:30:58 -07:00
Max Brunsfeld
649f200831
Expand regex/string rules as part of grammar preparation
...
This makes it possible to report errors in regex parsing
2014-05-19 20:54:59 -07:00
Max Brunsfeld
3e0debf814
Fix compile error on gcc
2014-05-09 15:37:30 -07:00
Max Brunsfeld
34137be12d
Represent state ids as unsigned shorts
...
This fixes some signedness conversion warnings
2014-05-08 13:23:46 -07:00
Max Brunsfeld
0a21eee3f0
Remove magic number from generated symbols enums
...
The symbol numbers 0 and 1 are reserved for 'error' and 'eof',
so the grammar's start symbol is always 2.
2014-05-08 13:14:45 -07:00
Max Brunsfeld
4700e33746
Introduce 'ubiquitous_tokens' concept, for parsing comments and such
2014-05-06 12:54:04 -07:00
Max Brunsfeld
0d763d229d
cpplint
2014-04-28 21:46:43 -07:00
Max Brunsfeld
25eda9d889
ISymbol -> Symbol
...
Interned symbols are now the main type of symbol in use
2014-04-28 20:43:27 -07:00
Max Brunsfeld
93df5579b4
Trim whitespace
2014-04-25 22:17:23 -07:00
Max Brunsfeld
68d44fd565
Intern symbols during grammar preparation
2014-04-22 23:38:26 -07:00
Max Brunsfeld
a437d39773
Add rule precedence construct
...
Still need to add some way of expressing left and right
associativity
2014-04-15 08:40:46 -07:00
Max Brunsfeld
5145bba53d
Silence missing-initializer warnings for gcc
2014-04-12 20:16:16 -07:00
Max Brunsfeld
bd5ec68c96
Get generated parsers building under gcc
2014-04-08 22:11:20 -07:00
Max Brunsfeld
6a0a28f4b3
WIP - try to fix travis build
2014-04-08 21:41:38 -07:00
Max Brunsfeld
1cc7e32e2d
Fix handling of tokens consisting of separator characters
...
The parser is no longer hard-coded to skip whitespace. Tokens
such as newlines, whose characters overlap with the separator
characters, can now be correctly recognized.
2014-04-03 19:10:09 -07:00
Max Brunsfeld
a79a7435de
Remove remaining trailing whitespace from generated c code
2014-03-29 19:21:42 -07:00
Max Brunsfeld
8e1b78ca8e
Remove trailing whitespace from generated c code
2014-03-29 19:00:31 -07:00
Max Brunsfeld
13c4e6e648
Tweak format for example grammars
2014-03-28 13:51:32 -07:00
Max Brunsfeld
e1ac62edc5
Give better symbol names to generated tokens
...
This should make debugging easier
2014-03-27 12:54:54 -07:00
Max Brunsfeld
3f770ff3c3
Remove unused consumed_symbols vector from parse items
2014-03-26 21:04:11 -07:00
Max Brunsfeld
4454925b5a
Clean up parser macros more
2014-03-26 13:03:12 -07:00
Max Brunsfeld
f601322956
Clean up macros in parser.h
2014-03-25 19:51:34 -07:00
Max Brunsfeld
80b19cbb83
Construct entire parse table statically
...
This removes the need for the 'init_parse_table' function,
which was not really thread safe
2014-03-25 19:34:17 -07:00