Commit graph

92 commits

Author SHA1 Message Date
Max Brunsfeld
80b8a0a9fb Rename stack_right_position -> stack_total_tree_size
I want to re-use the stack data structure for storing the
re-usable nodes from the previous parse tree during an edit.
In this case, the stack won't conceptually start at position
zero, so the name 'right_position' doesn't make sense.
2014-10-08 17:37:21 -07:00
Max Brunsfeld
26ac5788b6 Don't use struct literal syntax for TSLength 2014-09-26 16:31:36 -07:00
Max Brunsfeld
c1565c1aae Track AST nodes' sizes in characters as well as bytes
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
2014-09-26 16:15:07 -07:00
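As a rough illustration of the commit above, the sketch below measures a UTF-8 string in both units. The field names `bytes` and `chars` and the helpers `length_of_utf8` and `length_add` are assumptions made for this example; the commit message only states that TSLength holds both counts.

```c
#include <stddef.h>

/* Assumed layout: the commit only says TSLength carries both counts. */
typedef struct {
  size_t bytes;
  size_t chars;
} TSLength;

/* Hypothetical helper: measure a NUL-terminated UTF-8 string. */
TSLength length_of_utf8(const char *text) {
  TSLength result;
  result.bytes = 0;
  result.chars = 0;
  for (; text[result.bytes] != '\0'; result.bytes++) {
    /* Continuation bytes look like 10xxxxxx; any other byte starts a character. */
    if (((unsigned char)text[result.bytes] & 0xC0) != 0x80)
      result.chars++;
  }
  return result;
}

/* Hypothetical helper: add two lengths, e.g. a node's position plus its size. */
TSLength length_add(TSLength a, TSLength b) {
  TSLength result;
  result.bytes = a.bytes + b.bytes;
  result.chars = a.chars + b.chars;
  return result;
}
```

For ASCII-only input the two counts agree; they diverge as soon as the text contains a multi-byte character.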
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
25a254a732 Comment and format 2014-08-31 16:24:27 -07:00
Max Brunsfeld
85d8c9df5c Handle multiple ubiquitous tokens in a row 2014-08-31 12:11:16 -07:00
Max Brunsfeld
e6bbab41e5 Realloc parse stack when it grows to its capacity 2014-08-30 21:39:55 -07:00
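A minimal sketch of the grow-on-full pattern that the commit above names. The `Stack` type, the `int` entries, the initial capacity of 4, and the doubling factor are all assumptions for illustration; the commit does not say how the real parse stack is laid out or how fast it grows.

```c
#include <stdlib.h>

/* Hypothetical stack; int entries stand in for the parser's real entries. */
typedef struct {
  int *entries;
  size_t size;
  size_t capacity;
} Stack;

/* Push an entry, growing the buffer when it is full.
   Returns 1 on success and 0 if the allocation fails. */
int stack_push(Stack *stack, int entry) {
  if (stack->size == stack->capacity) {
    /* Assumed policy: start at 4 entries, then double. */
    size_t new_capacity = stack->capacity ? stack->capacity * 2 : 4;
    int *entries = realloc(stack->entries, new_capacity * sizeof(int));
    if (!entries) return 0; /* the old buffer is still valid */
    stack->entries = entries;
    stack->capacity = new_capacity;
  }
  stack->entries[stack->size++] = entry;
  return 1;
}
```

A stack initialized to `{NULL, 0, 0}` works here because `realloc` with a null pointer behaves like `malloc`.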
Max Brunsfeld
4327f3ed26 Refactor parser and stack 2014-08-09 01:03:55 -07:00
Max Brunsfeld
1e79ed794b Allow multiple top-level nodes
Now, the root node of a document is always a document node.
It will often have only one child node which corresponds to the grammar's
start symbol, but not always. Currently, it may have more than one child
if there are ubiquitous tokens such as comments at the beginning of the
document. In the future, it will also be possible
for the document to have multiple children if the document is partially parsed.
2014-08-09 00:00:20 -07:00
Max Brunsfeld
7ba3953f7e Simplify handling of ubiquitous tokens during reduce 2014-08-08 08:46:01 -07:00
Max Brunsfeld
eecbcccee0 Remove generated parsers' dependency on the runtime library
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime
2014-07-30 23:40:02 -07:00
Max Brunsfeld
98cc2f2264 Auto-format all source code with clang-format 2014-07-21 13:20:00 -07:00
Max Brunsfeld
df359bc01f Use 2-space indent in c files 2014-07-20 20:27:33 -07:00
Max Brunsfeld
779bf0d745 Don't store tree's hidden children in a separate array
Just mark hidden trees as such, and skip them when
pretty-printing a tree
2014-07-17 13:36:53 -07:00
Max Brunsfeld
25f927e321 Remove unnecessary accessor functions for tree 2014-07-14 21:11:15 -07:00
Max Brunsfeld
9da7663e99 Combine TSParser and TSStateMachine objects
My original thought was to decouple the runtime from
the LR parser generator by making TSParser a generic
interface that LR parsers implement.

I think this was more trouble than it was worth.
2014-07-10 13:23:20 -07:00
Max Brunsfeld
26f612a20d Rename type ts_stack -> TSStack 2014-06-28 19:04:14 -07:00
Max Brunsfeld
d7449bf5ea Rename type ts_symbol -> TSSymbol 2014-06-28 18:53:32 -07:00
Max Brunsfeld
7e0d46002c Rename type ts_state_id -> TSStateId 2014-06-28 18:51:06 -07:00
Max Brunsfeld
5f59de72a8 Rename type ts_tree -> TSTree 2014-06-28 18:48:07 -07:00
Max Brunsfeld
9686c57e90 Allow ubiquitous tokens to also be used in grammar rules 2014-06-26 08:52:42 -07:00
Max Brunsfeld
63cde3967c Add unit test for stack
- Also, fix bug where trees pushed onto the stack were not retained
2014-06-03 13:19:49 -07:00
Max Brunsfeld
baec9f2c9a Move computation of tree size/offset into tree constructor 2014-06-02 13:32:36 -07:00
Max Brunsfeld
ccc1b41f2a Make separate header files for stack and lexer 2014-05-09 13:32:12 -07:00
Max Brunsfeld
e4be585c43 Handle ubiquitous tokens at the beginning of programs
As a final step before returning the finished parse tree, check if
there are still multiple nodes on the stack. If so, make the inner
nodes children of the top node.
2014-05-09 12:46:36 -07:00
Max Brunsfeld
3f374c6547 Tidy up 2014-05-08 13:27:48 -07:00
Max Brunsfeld
4700e33746 Introduce 'ubiquitous_tokens' concept, for parsing comments and such 2014-05-06 12:54:04 -07:00
Max Brunsfeld
d957021982 Removed unused constant in stack.c 2014-03-26 20:50:55 -07:00
Max Brunsfeld
09e28e7859 Collapse nodes with only one child and no additional text content 2014-03-26 00:10:59 -07:00
Max Brunsfeld
316adc7788 Represent tree symbols as unsigned integers 2014-03-25 23:47:25 -07:00
Max Brunsfeld
25861b7f03 Remove reduction-specific collapse flags in favor of globally hidden symbols 2014-03-25 09:05:55 -07:00
Max Brunsfeld
671f1a1ddc Start work on javascript grammar 2014-03-24 09:14:29 -07:00
Max Brunsfeld
3a7c4bb5b1 Store AST nodes' non-hidden children 2014-03-24 01:03:32 -07:00
Max Brunsfeld
95188d84b6 Make tree struct private 2014-03-24 00:34:13 -07:00
Max Brunsfeld
5869c1ea18 Clean up stack breakdown function 2014-03-21 13:02:25 -07:00
Max Brunsfeld
fbe8b0a905 Fix incremental parsing
Stop collapsing hidden symbols upon reducing them.
Sadly, collapsing them messes up the ability to re-use
parse trees. Instead, for now, hide these nodes when
stringifying parse trees.
2014-03-19 19:27:31 -07:00
Max Brunsfeld
7e94a4f1b2 Start work on reading input incrementally 2014-03-18 13:23:21 -07:00
Max Brunsfeld
fbb9b24d7b Refactor ts_tree_children 2014-03-18 12:47:26 -07:00
Max Brunsfeld
67b33a615b Refactor generated parsers to use an explicit table
This is slightly slower than encoding the parse table in
flow control, but allows the parser to inspect the parse
table more flexibly. This is needed for incremental parsing.
2014-03-17 18:43:17 -07:00
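To make the trade-off in the commit above concrete, here is a toy table stored as data. Everything in it is hypothetical (the action kinds, symbols, three-state table, and both helpers) and it is not the layout the generated parsers use. The point is only that a table held as data can be queried by other code, which is not possible when the same decisions are baked into `switch` statements.

```c
#include <stddef.h>

typedef enum { ACTION_ERROR, ACTION_SHIFT, ACTION_REDUCE, ACTION_ACCEPT } ActionType;

typedef struct {
  ActionType type;
  unsigned value; /* target state for SHIFT, production id for REDUCE */
} ParseAction;

enum { SYM_END, SYM_NUMBER, SYM_PLUS, SYMBOL_COUNT };
enum { STATE_COUNT = 3 };

/* Toy table for inputs like "1 + 2"; unlisted cells default to ACTION_ERROR. */
static const ParseAction parse_table[STATE_COUNT][SYMBOL_COUNT] = {
  [0] = { [SYM_NUMBER] = { ACTION_SHIFT, 1 } },
  [1] = { [SYM_END] = { ACTION_ACCEPT, 0 }, [SYM_PLUS] = { ACTION_SHIFT, 2 } },
  [2] = { [SYM_NUMBER] = { ACTION_SHIFT, 1 } },
};

/* The parser's core lookup: one array index per step. */
ParseAction table_lookup(unsigned state, unsigned symbol) {
  return parse_table[state][symbol];
}

/* An inspection that needs the table as data: count the symbols
   that are acceptable in a given state without running the parser. */
size_t count_valid_symbols(unsigned state) {
  size_t count = 0;
  for (unsigned symbol = 0; symbol < SYMBOL_COUNT; symbol++) {
    if (parse_table[state][symbol].type != ACTION_ERROR) count++;
  }
  return count;
}
```

The indirection through the array is the likely source of the slowdown the commit mentions, since each step becomes a memory lookup rather than a direct branch.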
Max Brunsfeld
0d6435e24a Pass edit information into parser function 2014-03-15 16:55:35 -07:00
Max Brunsfeld
0dc3a95d0c Refactor parser header
Make separate lexer, stack and parser structs.
2014-03-15 14:43:50 -07:00