tree-sitter

Author	SHA1	Message	Date
Max Brunsfeld	c8d7c16f87	Use out-of-context states when in error parse state	2016-03-02 20:56:05 -08:00
Max Brunsfeld	9b2e775b79	Store out-of-context states in the language struct	2016-03-02 20:56:05 -08:00
Max Brunsfeld	ffcd8b5c49	Generate C code for the in-progress symbols in each parse state	2016-03-02 20:56:05 -08:00
Max Brunsfeld	d4632ab9a9	Make the compile function plain C and take a JSON grammar	2016-01-11 12:33:48 -08:00
Max Brunsfeld	4b04afac5e	Control lexer's error-mode via explicit boolean argument. Previously, the lexer would operate in error-mode (ignoring any garbage input until it found a valid token) if it was invoked in the 'error' state. Now that the error state is deduped with other lexical states, the lexer might be invoked in that state even when error-mode is not intended. This adds a third argument to `ts_lex` that explicitly sets the error-mode. This bug was unlikely to occur in any real grammars, but it caused the node-tree-sitter-compiler test suite to fail for some grammars with only one rule.	2015-12-30 09:43:12 -08:00
Max Brunsfeld	4ad1a666be	clang-format	2015-12-29 21:17:31 -08:00
Max Brunsfeld	97a281502e	Store parse table more compactly	2015-12-29 11:27:41 -08:00
Max Brunsfeld	2bcd2e4d00	Reuse fragile tokens that came from the current lex state	2015-12-21 16:04:11 -08:00
Max Brunsfeld	c495076adb	Record in parse table which actions can hide splits. Suppose a parse state S has multiple actions for a terminal lookahead symbol A. Then during incremental parsing, while in state S, the parser should not reuse a non-terminal lookahead B where FIRST(B) contains A, because reusing B might prematurely discard one of the possible actions that a batch parser would have attempted in state S, upon seeing A as a lookahead.	2015-12-17 13:11:56 -08:00
Max Brunsfeld	66144dc28e	Treat tokens that are sometimes extra as fragile	2015-12-16 20:04:45 -08:00
Max Brunsfeld	d713054d61	Record which tokens are fragile when lexing	2015-12-10 21:05:54 -08:00
Max Brunsfeld	d2bf88d5fe	Include rows and columns in TSLength. This way, we don't have to have separate 1D and 2D versions for so many values.	2015-12-04 20:20:29 -08:00
Max Brunsfeld	22c76fc71b	Remove TSLength from runtime header. Refactor node functions now that character offset and byte offset are stored separately.	2015-12-04 10:45:30 -08:00
Max Brunsfeld	ad619d95f6	Add 'extra' field to symbol metadata. This stores whether a symbol is only ever used as a ubiquitous token. This will allow ubiquitous nodes to be reused more effectively: if they are always ubiquitous, then they can be reused immediately, and otherwise, they must be broken down in case they need to be used structurally.	2015-12-02 15:10:24 -08:00
Max Brunsfeld	f08554e958	Replace NodeType enum with SymbolMetadata bitfield. This will allow storing other metadata about symbols, like if they only appear as ubiquitous tokens.	2015-12-02 15:10:24 -08:00
joshvera	b0f6bac3ab	replace start and end with padding and size	2015-11-18 16:34:50 -08:00
joshvera	e720922662	Add source info to TSLexer	2015-11-12 12:24:05 -05:00
Max Brunsfeld	7ee5eaa16a	Rename node accessor methods. Instead of child() vs concrete_child(), next_sibling() vs next_concrete_sibling(), etc, the default is switched: child() refers to the concrete syntax tree, and named_child() refers to the AST. Because the AST is abstract through exclusion of some nodes, the names are clearer if the qualifier goes on the AST operations.	2015-09-08 23:16:24 -07:00
Max Brunsfeld	c3f3f19ea8	Add concrete_child and concrete_child_count Node methods	2015-09-08 09:53:26 -07:00
Max Brunsfeld	9591c88f39	In runtime, distinguish between anonymous and hidden nodes	2015-09-06 00:12:37 -07:00
Max Brunsfeld	54e40b8146	Rework AST access API: reduce heap allocation	2015-07-31 15:47:48 -07:00
Max Brunsfeld	f9b057f3a9	clang-format everything	2015-07-27 18:29:48 -07:00
Max Brunsfeld	aff8bc3266	Split parse stack when there are multiple parse actions	2015-07-09 23:09:33 -07:00
Max Brunsfeld	755894b44d	Allow multiple parse actions in parse table	2015-06-18 17:03:16 -07:00
Max Brunsfeld	d5ce3a9b5a	lexer: in error mode, continue until token is found	2015-06-15 15:26:05 -07:00
Max Brunsfeld	2d436cf141	Identify fragile reductions at compile time	2015-02-21 15:11:03 -08:00
Max Brunsfeld	8cf800ef5d	Unify debugging API for parsing and lexing	2014-10-17 17:52:54 -07:00
Max Brunsfeld	7498725d7f	Move lexer debugging logic out of public header	2014-10-17 16:20:01 -07:00
Max Brunsfeld	5c600942df	Inline some helper functions for lexer	2014-10-17 15:22:01 -07:00
Max Brunsfeld	c594208ab8	Allow callbacks to be specified for debug output	2014-10-13 01:02:18 -07:00
Max Brunsfeld	6d37877e49	Tweak debugging output	2014-10-05 16:56:29 -07:00
Max Brunsfeld	e5ea4efb0b	Use stdbool.h	2014-10-03 16:06:08 -07:00
Max Brunsfeld	78c5fe8e02	clang-format	2014-10-03 15:44:21 -07:00
Max Brunsfeld	444188cb5f	Display characters > 255 as numbers in debug output	2014-09-27 16:00:27 -07:00
Max Brunsfeld	c1565c1aae	Track AST nodes' sizes in characters as well as bytes. The `pos` and `size` functions for Nodes now return TSLength structs, which contain lengths in both characters and bytes. This is important for knowing the number of unicode characters in a Node.	2014-09-26 16:15:07 -07:00
Max Brunsfeld	f2e2102a25	Add missing import of stdint.h	2014-09-13 00:25:12 -07:00
Max Brunsfeld	141cbcfa02	Read unicode characters using utf8proc	2014-09-13 00:24:10 -07:00
Max Brunsfeld	68d6e242ee	Fix parsing of wildcard patterns at the ends of documents: remove special EOF handling from lexer; explicitly exclude the EOF character from all-inclusive character sets.	2014-09-11 13:10:23 -07:00
Max Brunsfeld	2e7ffb4d14	Tweak auto-format settings. Prefer lines that exceed 80 characters by a small margin to line breaks in argument lists.	2014-09-09 13:15:40 -07:00
Max Brunsfeld	c0a3f8d39c	Remove some macros from public parser header	2014-09-05 23:47:38 -07:00
Max Brunsfeld	9c0b5b5571	clang-format	2014-09-03 18:53:38 -07:00
Max Brunsfeld	77529ace3d	Fix infinite loop in certain cases w/ unterminated tokens	2014-09-03 00:38:44 -07:00
Max Brunsfeld	545e575508	Revert "Remove the separator characters construct". This reverts commit `5cd07648fd`. The separators construct is useful as an optimization. It turns out that constructing a node for every chunk of whitespace in a document causes a significant performance regression. Conflicts: src/compiler/build_tables/build_lex_table.cc, src/compiler/grammar.cc, src/runtime/parser.c	2014-09-02 08:03:51 -07:00
Max Brunsfeld	5cd07648fd	Remove the separator characters construct. Now, grammars can handle whitespace by making it another ubiquitous token, like comments. For now, this has the side effect of whitespace being included in the tree that precedes it. This was already an issue for other ubiquitous tokens though, so it needs to be fixed anyway.	2014-09-01 20:19:43 -07:00
Max Brunsfeld	2985a98150	Build error nodes in lexer again, not in parser	2014-08-31 16:59:01 -07:00
Max Brunsfeld	226ffd6b5b	Fix initializer list deduction warnings in specs	2014-08-27 22:23:45 -07:00
Max Brunsfeld	e0a53b9f14	Make parse and lex debug output more readable	2014-08-27 18:27:53 -07:00
Max Brunsfeld	bd145d2c6a	Preserve the initial error node in handle_error function	2014-08-26 23:22:18 -07:00
Max Brunsfeld	37d5db6fee	Move newline in lexer debugging output	2014-08-26 22:21:21 -07:00
Max Brunsfeld	346cf4fe5d	Remove LEX_PANIC macro	2014-08-26 13:12:12 -07:00

127 commits