tree-sitter

Author	SHA1	Message	Date
Alex Pinkus	8fadf18655	Expand regex support to include emojis and binary ops The `Emoji` property alias is already present, but the actual property is not available since it lives in a new file. This adds that file to `generate-unicode-categories-json`. The `emoji-data` file follows the same format as the ones we already consume in `generate-unicode-categories-json`, so adding emoji support is fairly easy. Without this, grammars would need to hard-code a set of unicode ranges in their own regex. The JavaScript library `emoji-regex` cannot be used because of #451. For unclear reasons, the characters #, *, and 0-9 are marked as `Emoji=Yes` by `emoji-data.txt`. Because of this, a grammar that wishes to use emojis is likely to want to exclude those characters. For that reason, this change also adds support for binary operations in regexes, e.g. `[\p{Emoji}&&[^#0-9]]`. Lastly (and perhaps controversially), this change introduces new variables available at grammar compile time, for the major, minor, and patch versions of the tree-sitter CLI used to compile the grammar. This will allow grammars to conditionally adopt these new regex features while remaining backward compatible with older versions of the CLI. Without this part of the change, grammar authors who do not precompile and check in their `grammar.json` would need to wait for downstream systems to adopt a newer tree-sitter CLI version before they could begin to use these features.	2022-02-19 11:41:36 -08:00
Max Brunsfeld	2346570901	Merge pull request #1601 from alex-pinkus/fix-abi-14-struct-ordering Fix ABI 14 back compat by moving primary_field_ids to the end	2022-01-17 18:02:35 -08:00
Alex Pinkus	858ea5782b	Fix back compat by moving primary_field_ids to the end Due to an oversight in #1589, I added `primary_field_ids` into the `TSLanguage` struct in a place that wasn't the end. This is not actually backwards compatible and causes downstream failures :(	2022-01-17 17:23:02 -08:00
Max Brunsfeld	691469c783	Merge pull request #1599 from tree-sitter/abi-version-flag Add `--abi` flag to the generate command, generate version 13 by default	2022-01-17 15:41:28 -08:00
Max Brunsfeld	516fd6f6de	Add --abi flag to generate command, generate version 13 by default	2022-01-17 14:50:47 -08:00
Max Brunsfeld	aaf4572727	Merge pull request #1589 from alex-pinkus/deduplicate-core-ids Ignore duplicate states when initializing subgraphs in `ts_query__analyze_patterns`	2022-01-17 13:54:31 -08:00
Alex Pinkus	eaf9b170f1	Don't start with duplicate states in `ts_query__analyze_patterns` This change exposes a new `primary_state_ids` field on the `TSLanguage` struct, and populates it by tracking the first encountered state with a given `core_id`. (For posterity: the initial change just exposed `core_id` and deduplicated within `ts_analyze_query`). With this `primary_state_ids` field in place, the `ts_query__analyze_patterns` function only needs to populate its subgraphs with starting states that are _primary_, since non-primary states behave identically to primary ones. This leads to large savings across the board, since most states are not primary.	2022-01-16 11:17:47 -08:00
Max Brunsfeld	e96ee19901	Merge pull request #1504 from hendrikvanantwerpen/expose-capture-suffixes Expose capture suffixes in queries	2022-01-14 12:11:25 -08:00
Max Brunsfeld	9df064c9fe	Merge pull request #1581 from tlaplus-community/get-codepoint-column `get_column` now counts codepoints instead of bytes	2022-01-12 10:49:44 -08:00
Andrew Helwer	e1ee261181	Changed decimal unicode codepoint to hex	2022-01-11 19:15:36 -05:00
Hendrik van Antwerpen	9dace8f9fe	Add explicit breaks to prevent fall through errors	2022-01-11 19:08:32 +01:00
Hendrik van Antwerpen	c76d8ee076	Represent quantifiers using bytes instead of ints	2022-01-11 18:41:33 +01:00
Hendrik van Antwerpen	70aee901ac	Reduce error handling logic	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	acd3d32c36	Remove reference to strings from quantifier-only function	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	99e74fa0f5	Move quantifier addition out of loop and drop conditional	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	93db863729	Remove obsolete FIXMEs	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	ec9b00e5c6	Handle multiple top-level alternations correctly	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	8b28f3a8c4	Shorten quantifier operations by using early returns	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	e338726cde	Prefix globally visible TSquantifier values	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	36f2440369	Complete comment	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	ae2ac3c0db	Initialize variable to silence compiler warnings	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	a1a241b013	Expose quantifiers per pattern, instead of merging for all patterns in a query	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	1d513bcf67	Rewrite quantifier operations - Simplify control flow by having a single return at the end of the function. - Follow enum order for case order.	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	1f1a449c76	Improve capture quantifier computation Compute quantifiers in a bottom-up manner, which allows more precise results for alternations, where the quantifiers are now precisely joined.	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	9bac066330	Deal with quantifiers appearing on capture's enclosing patterns - Use a proper enum type for quantifiers. - Drop quantifiers from `TSQueryStep`, which was not used. - Keep track of the captures introduced during a pattern parse, and apply the quantifier for the pattern to the captures that were introduced by the pattern or any sub patterns. - Use 'quantifier' instead of 'suffix'.	2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen	ae7869d1a6	Expose capture suffixes in queries	2022-01-11 18:33:36 +01:00
Andrew Helwer	69ff091a87	Added includes for macos	2022-01-11 12:31:41 -05:00
Max Brunsfeld	bf210f0c9e	Merge pull request #1578 from alex-pinkus/analysis-state-set-pointers Improve `ts_query_new` performance using pointer indirection	2022-01-11 09:13:36 -08:00
Andrew Helwer	5a6530a413	Added tests	2022-01-11 12:05:37 -05:00
Alex Pinkus	679a841183	Add pointer indirection to AnalysisStateSet Profiling the `ts_query__analyze_patterns` function shows that it spends a lot of time copying items in its various state sets. These state sets are kept sorted, and the items are fairly large, so any time that we insert new entries near the front of the array, a lot of calls to memcpy must occur. In advance of more sophisticated rework, one easy win is to hide the large `AnalysisStateSet` objects behind pointers, so that the size of each item in the list goes from 68 to 8 bytes, and add an object pool to reuse allocations. This shows a significant performance improvement for grammars that have a lot of states in them.	2022-01-10 20:07:14 -08:00
Andrew Helwer	80c34d62ab	Fixed rust build, updated docs	2022-01-07 10:36:25 -05:00
Andrew Helwer	3ab6d1b937	Improve diff further	2022-01-07 10:17:53 -05:00
Andrew Helwer	bfb692d2f7	Improve diff	2022-01-07 10:16:20 -05:00
Andrew Helwer	ace81f6267	Don't log when counting codepoints	2022-01-07 10:13:57 -05:00
Andrew Helwer	0a52e90b01	Fixed pointer type	2022-01-07 10:13:57 -05:00
Andrew Helwer	75aa295b66	get_column now counts codepoints	2022-01-07 10:13:57 -05:00
Max Brunsfeld	e81976d27a	Merge pull request #1570 from hickford/patch-1 Add link to Protocol Buffers grammar	2022-01-03 09:41:21 -08:00
Max Brunsfeld	b2fe125213	Merge pull request #1571 from 414owen/add-realloc-to-wasm-exports Add realloc to wasm exports	2022-01-03 09:01:13 -08:00
Owen Shepherd	1aa6541476	Add realloc to wasm exports	2022-01-03 16:07:39 +00:00
M Hickford	4c6175b70a	Add link to Protocol Buffers grammar	2022-01-02 21:18:12 +00:00
Max Brunsfeld	4ee52ee99e	0.20.2	2021-12-31 17:23:08 -08:00
Max Brunsfeld	5d8a1ace56	web: 0.20.2	2021-12-30 17:14:04 -08:00
Max Brunsfeld	f010781efa	lib: 0.20.2	2021-12-30 16:35:21 -08:00
Max Brunsfeld	3b7c4e62d2	🎨 subtree.h	2021-12-30 16:33:26 -08:00
Max Brunsfeld	8df0b8de7e	Convert more fixture grammars from JSON to JS	2021-12-30 16:27:02 -08:00
Max Brunsfeld	622359b400	Simplify allocation-recording in test suite using new ts_set_allocator API	2021-12-30 16:09:07 -08:00
Max Brunsfeld	e01ea9ff51	Merge pull request #1544 from mkvoya/dynamic-allocator Allow to change the allocator dynamically	2021-12-28 13:39:11 -08:00
Max Brunsfeld	0a85746cc4	Merge pull request #1473 from thestr4ng3r/big-endian Make SubtreeInlineData work on Big-Endian	2021-12-26 21:03:26 -08:00
Florian Märkl	d5d99e0bfb	Address feedback	2021-12-24 17:07:32 +01:00
Florian Märkl	2024f27534	Make SubtreeInlineData work on Big-Endian	2021-12-24 16:47:10 +01:00

3890 commits