Commit graph

19 commits

Author SHA1 Message Date
Max Brunsfeld
26ac5788b6 Don't use struct literal syntax for TSLength 2014-09-26 16:31:36 -07:00
Max Brunsfeld
c1565c1aae Track AST nodes' sizes in characters as well as bytes
The `pos` and `size` functions for Nodes now return TSLength structs,
which contain lengths in both characters and bytes. This is important
for knowing the number of unicode characters in a Node.
2014-09-26 16:15:07 -07:00
Max Brunsfeld
141cbcfa02 Read unicode characters using utf8proc 2014-09-13 00:24:10 -07:00
Max Brunsfeld
e23f11b7c4 Allow lexical debug mode to be enabled on documents
- `ts_document_set_debug(doc, 1)` implies parse debug mode
- `ts_document_set_debug(doc, > 1)` implies parse and lex debug mode
2014-09-11 13:12:06 -07:00
Max Brunsfeld
545e575508 Revert "Remove the separator characters construct"
This reverts commit 5cd07648fd.

The separators construct is useful as an optimization. It turns out that
constructing a node for every chunk of whitespace in a document causes a
significant performance regression.

Conflicts:
	src/compiler/build_tables/build_lex_table.cc
	src/compiler/grammar.cc
	src/runtime/parser.c
2014-09-02 08:03:51 -07:00
Max Brunsfeld
5cd07648fd Remove the separator characters construct
Now, grammars can handle whitespace by making it another ubiquitous
token, like comments.

For now, this has the side effect of whitespace being included in the
tree that precedes it. This was already an issue for other ubiquitous
tokens though, so it needs to be fixed anyway.
2014-09-01 20:19:43 -07:00
Max Brunsfeld
2985a98150 Build error nodes in lexer again, not in parser 2014-08-31 16:59:01 -07:00
Max Brunsfeld
25a254a732 Comment and format 2014-08-31 16:24:27 -07:00
Max Brunsfeld
604b149c4b Assign sizes to error nodes in handle_error 2014-08-28 18:35:30 -07:00
Max Brunsfeld
3430a5edcc Clarify distinction btwn tree padding, tree offset, node position
- Node position is public. It represents the node's first character
  index in the document.
- Tree offset is private. It represents the distance between the tree's
  first character index and it's parent's first character index.
- Tree padding is private. It represents the amount of whitespace
  (or other separator characters) immediately preceding the tree.
2014-08-28 13:22:06 -07:00
Max Brunsfeld
bd145d2c6a Preserve the initial error node in handle_error function 2014-08-26 23:22:18 -07:00
Max Brunsfeld
4720672cfb In lexer, stop reading input as soon as empty string is returned 2014-08-01 13:27:50 -07:00
Max Brunsfeld
412cc93812 clang format 2014-07-31 13:11:39 -07:00
Max Brunsfeld
eecbcccee0 Remove generated parsers' dependency on the runtime library
Generated parsers no longer export a parser constructor function.
They now export an opaque Language object which can be set on
Documents directly. This way, the logic for constructing parsers
lives entirely in the runtime. The Languages are just structs which
have no load-time dependency on the runtime
2014-07-30 23:40:02 -07:00
Max Brunsfeld
98cc2f2264 Auto-format all source code with clang-format 2014-07-21 13:20:00 -07:00
Max Brunsfeld
df359bc01f Use 2-space indent in c files 2014-07-20 20:27:33 -07:00
Max Brunsfeld
b3385f20c8 Hide TSTree, expose TSNode 2014-07-17 23:29:11 -07:00
Max Brunsfeld
779bf0d745 Don't store tree's hidden children in a separate array
Just mark hidden trees as such, and skip them when
pretty-printing a tree
2014-07-17 13:36:53 -07:00
Max Brunsfeld
9da7663e99 Combine TSParser and TSStateMachine objects
My original thought was to decouple the runtime from
the LR parser generator by making TSParser a generic
interface that LR parsers implement.

I think this was more trouble than it was worth.
2014-07-10 13:23:20 -07:00