Removing this restriction created problems for the Rust grammar, and
possibly others. The proper fix would be to ensure that the 'word
token' matches *every* possible string that a 'keyword token'
matches, as opposed to just matching *some* of the same strings.
This would require us to gather a little more information
about how tokens conflict. For now, I'm just going to put back the
hard-coded logic that we had.
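Roughly, the proper check would look like this (a sketch only; `word_pattern` and `keywords` are illustrative names, keywords are assumed to be plain literal strings, and keyword tokens defined by regexes would need a true language-inclusion check instead):

    use regex::Regex;

    /// Returns true if `word_pattern` matches every keyword in its
    /// entirety, i.e. the word token covers *every* string the keyword
    /// tokens match, not just some of them.
    fn word_token_covers_keywords(word_pattern: &str, keywords: &[&str]) -> bool {
        // Anchor the pattern so a partial match doesn't count as coverage.
        let anchored = format!("^(?:{})$", word_pattern);
        let word_regex = Regex::new(&anchored).expect("invalid word token pattern");
        // For example:
        //   word_token_covers_keywords("[a-z_]+", &["if", "else"]) == true
        //   word_token_covers_keywords("[a-z_]+", &["#if"]) == false
        keywords.iter().all(|kw| word_regex.is_match(kw))
    }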
This prevented conflicts between some tokens from being recorded
properly. In the case of JavaScript, it prevented tree-sitter from
recognizing the conflict between the forward slash operator and the
regex token, so regexes were incorrectly merged into parse states
where '/' was already valid.
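As a toy illustration of the conflict that went unrecorded (tokens here are reduced to hypothetical first-character sets; the real generator compares the tokens' lexing behavior, not just their first characters):

    use std::collections::HashSet;

    /// A token reduced to the set of characters that can begin a match.
    /// This is a toy stand-in for the real analysis.
    struct Token {
        name: &'static str,
        first_chars: HashSet<char>,
    }

    /// Conservatively report a conflict whenever two tokens can begin
    /// with the same character.
    fn conflicts(a: &Token, b: &Token) -> bool {
        !a.first_chars.is_disjoint(&b.first_chars)
    }

    fn main() {
        let slash = Token { name: "/", first_chars: HashSet::from(['/']) };
        let regex = Token { name: "regex", first_chars: HashSet::from(['/']) };
        // Both tokens can begin with '/', so regexes must not be merged
        // into parse states where the '/' operator is already valid.
        assert!(conflicts(&slash, &regex));
    }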
Refs tree-sitter/tree-sitter-javascript#71
This simplifies the logic for determining whether a token is reusable
and makes it more conservative. It should fix some incremental parsing
bugs that are being caught by the randomized tests on CI.
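The conservative shape of such a check, sketched with heavily simplified, hypothetical types (a token here is just a byte range plus the lexical state it was scanned in, far less than the real tree carries):

    /// An edit, as the byte range of the old text that was replaced
    /// (hypothetical representation).
    struct Edit {
        start: usize,
        old_end: usize,
    }

    /// A token from the previous parse (hypothetical representation).
    struct ScannedToken {
        start: usize,
        end: usize,
        lex_state: usize,
    }

    /// Conservative reuse test: keep the old token only if the edit did
    /// not touch its bytes (the byte at `end` counts, since the lexer
    /// may have read it as lookahead) and we are re-lexing in the same
    /// lexical state as before. When in doubt, re-lex.
    fn token_is_reusable(token: &ScannedToken, edit: &Edit, current_lex_state: usize) -> bool {
        let untouched = token.end < edit.start || token.start > edit.old_end;
        untouched && token.lex_state == current_lex_state
    }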
Do not merge a token T into a parse state S if S contains
external tokens that can be *followed* by tokens that could
be shadowed by T.
At this point, the only automated test for this logic is via
the bash grammar, in which the `]` token should not be merged
into states in which `_concat` is valid, because `_concat`
can be followed by a `_special_characters` token, and `]`
would shadow `_special_characters`.
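Schematically, with illustrative types (`follow` and `shadows` stand in for information the generator would have to compute):

    use std::collections::{HashMap, HashSet};

    type TokenId = usize;

    /// May token `t` be merged into a state whose valid external tokens
    /// are `externals`? `follow[e]` is the set of tokens that can come
    /// immediately after external token `e`, and `shadows(t, u)` says
    /// whether `t` could lex where `u` is expected.
    fn can_merge(
        t: TokenId,
        externals: &HashSet<TokenId>,
        follow: &HashMap<TokenId, HashSet<TokenId>>,
        shadows: impl Fn(TokenId, TokenId) -> bool,
    ) -> bool {
        externals.iter().all(|&e| {
            follow
                .get(&e)
                .map_or(true, |next| next.iter().all(|&u| !shadows(t, u)))
        })
    }

In the bash example, `t` is `]`, `e` is `_concat`, `_special_characters` is in `follow[_concat]`, and `shadows(']', _special_characters)` holds, so the merge is rejected.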
While generating the parse table, keep track of which tokens can follow
one another. Then use this information to evaluate token conflicts more
precisely. This results in a smaller parse table than the previous,
overly conservative approach produced.
This fixes a bug in the C++ grammar where the `>>` token was merged into
a state where it was previously not valid, but the `>` token *was*
valid. This caused nested templates like

    std::vector<std::pair<int, int>>

to parse incorrectly.
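The bookkeeping, sketched with hypothetical types (the real table is built from the grammar's states rather than being fed pairs directly):

    use std::collections::{HashMap, HashSet};

    type TokenId = usize;

    /// Which tokens have been observed to follow which other tokens,
    /// gathered while building the parse table (illustrative structure).
    #[derive(Default)]
    struct FollowTable {
        follows: HashMap<TokenId, HashSet<TokenId>>,
    }

    impl FollowTable {
        /// Record one adjacent token pair encountered during generation.
        fn record(&mut self, a: TokenId, b: TokenId) {
            self.follows.entry(a).or_default().insert(b);
        }

        /// A conflict between two tokens only matters in states where
        /// they can actually be adjacent; everywhere else they can
        /// safely share a state, keeping the table small.
        fn can_follow(&self, a: TokenId, b: TokenId) -> bool {
            self.follows.get(&a).map_or(false, |set| set.contains(&b))
        }
    }

In the C++ example, `>` can be followed by another `>` at the end of a nested template argument list, so the conflict between `>` and `>>` is real in those states and `>>` stays out of them.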