tree-sitter

Author	SHA1	Message	Date
Jake Sarjeant	61b70943b1	feat(cli): add option to select JS runtime other than node	2023-08-03 21:34:47 +03:00
Amaan Qureshi	b8fe5fe21b	fix: do not allow eof to advance states if the new state is the same state	2023-08-02 10:47:27 +01:00
Andrew Hlynskyi	a2f834d846	More error contexts + conv panics to errors with context	2023-07-30 21:16:45 +03:00
Amaan Qureshi	f4e788b28e	feat: warn when unused conflicts are present in a grammar	2023-07-28 00:23:28 -04:00
Amaan Qureshi	c521e9c18e	chore: improve error message in some spots loading `grammar.json`	2023-07-24 00:44:44 -04:00
Amaan Qureshi	5fba369c4a	fix: disallow inlining the first rule This prevents a panic when indexing symbol_ids during the generation process	2023-07-19 16:14:58 -04:00
Andrew Hlynskyi	0b0cc6c429	Fix rustc 1.71.0 warnings	2023-07-13 17:50:04 +03:00
Andreas Deininger	0751736d17	docs: convert various links to https protocol	2023-04-04 18:05:46 +03:00
Max Brunsfeld	6b87326470	Merge pull request #1787 from kianmeng/fix-typos Fix typos	2022-08-25 10:25:39 -07:00
Nat Mote	4e3179fbc0	Avoid extracting default alias for extras Fixes #1834	2022-08-10 07:27:34 -07:00
Kian-Meng Ang	b8552ec6c4	Fix typos	2022-06-28 19:57:42 +08:00
Max Brunsfeld	4b93326898	Don't generate primary states array if it will be unused due to abi version setting	2022-03-02 14:57:59 -08:00
Max Brunsfeld	9866674cf8	Merge pull request #1660 from alex-pinkus/expanded-regex-support Expand regex support to include emojis and binary ops	2022-02-24 17:14:23 -08:00
Alex Pinkus	8fadf18655	Expand regex support to include emojis and binary ops The `Emoji` property alias is already present, but the actual property is not available since it lives in a new file. This adds that file to the `generate-unicode-categories-json`. The `emoji-data` file follows the same format as the ones we already consume in `generate-unicode-categories-json`, so adding emoji support is fairly easy. Without this, grammars would need to hard-code a set of unicode ranges in their own regex. The Javascript library `emoji-regex` cannot be used because of #451. For unclear reasons, the characters #, *, and 0-9 are marked as `Emoji=Yes` by `emoji-data.txt`. Because of this, a grammar that wishes to use emojis is likely to want to exclude those characters. For that reason, this change also adds support for binary operations in regexes, e.g. `[\p{Emoji}&&[^#0-9]]`. Lastly (and perhaps controversially), this change introduces new variables available at grammar compile time, for the major, minor, and patch versions of the tree-sitter CLI used to compile the grammar. This will allow grammars to conditionally adopt these new regex features while remaining backward compatible with older versions of the CLI. Without this part of the change, grammar authors who do not precompile and check-in their `grammar.json` would need to wait for downstream systems to adopt a newer tree-sitter CLI version before they could begin to use these features.	2022-02-19 11:41:36 -08:00
Max Brunsfeld	994cb61f2c	Always generate parser.h, regardless of chosen ABI version For some ABI changes, we may need to make changes to the parser.h in order to restore a previous binary format, but for the current range of supported ABI versions (13 + 14), the current parser.h is fine. Refs #1599	2022-01-23 10:29:52 -08:00
Max Brunsfeld	82ceebc10d	🎨 Use base struct syntax to clean up grammar expectations	2022-01-20 17:17:46 -08:00
Alex Pinkus	858ea5782b	Fix back compat by moving primary_field_ids to the end Due to an oversight in #1589, I added `primary_field_ids` into the `TSLanguage` struct in a place that wasn't the end. This is not actually backwards compatible and causes downstream failures :(	2022-01-17 17:23:02 -08:00
Max Brunsfeld	516fd6f6de	Add --abi flag to generate command, generate version 13 by default	2022-01-17 14:50:47 -08:00
Alex Pinkus	eaf9b170f1	Don't start with duplicate states in `ts_query__analyze_patterns` This change exposes a new `primary_state_ids` field on the `TSLanguage` struct, and populates it by tracking the first encountered state with a given `core_id`. (For posterity: the initial change just exposed `core_id` and deduplicated within `ts_analyze_query`). With this `primary_state_ids` field in place, the `ts_query__analyze_patterns` function only needs to populate its subgraphs with starting states that are _primary_, since non-primary states behave identically to primary ones. This leads to large savings across the board, since most states are not primary.	2022-01-16 11:17:47 -08:00
Max Brunsfeld	86b408412c	Use serde's derive feature everywhere	2021-11-21 13:39:30 -08:00
Max Brunsfeld	a0c085bbec	Return an error when trying to inline a token Fixes #1420	2021-11-19 13:02:04 -08:00
Max Brunsfeld	d05c665863	Convert some of the fixture grammars from JSON to JS These tests are easier to write and maintain if the grammars are just JS, like grammars normally are. It doesn't slow the tests down significantly to shell out to `node` for each of these grammars.	2021-10-22 18:47:23 -06:00
Razze	956705a23d	Update to unicode standard 14	2021-10-10 16:40:31 +02:00
FnControlOption	e030434ca7	Handle aliases in unicode property escapes in regexes	2021-08-18 22:22:46 -07:00
Paul Gey	a533e4d7bb	Remove unnecessary borrows This produces an `unused_must_use` warning on nightly: https://github.com/rust-lang/rust/pull/86426	2021-08-14 15:44:24 +02:00
Max Brunsfeld	c6dd5da5e6	Merge pull request #1329 from narpfel/improve-performance Improve performance of `tree-sitter generate`	2021-08-11 16:08:23 -07:00
Paul Gey	965e3c9e5e	`Generator::add_parse_table`: Store entries in hash map This avoids a quadratic behaviour due to repeatedly using `find` on a growing `Vec`.	2021-08-08 21:45:43 +02:00
Paul Gey	cf69a2c94c	Use `IndexMap` and `FxHash` for some hot hash maps	2021-08-08 21:45:43 +02:00
Andrew Hlynskyi	533073cdb5	fix(cli): Remove tree-sitter grammar ./... call limitation	2021-08-06 02:11:35 +03:00
Max Brunsfeld	c512a0eed7	Merge pull request #1194 from ahlinc/fix/1032 Close #1032 - fix all weirdness in the generated Cargo.toml	2021-06-29 16:48:23 -07:00
Andrew Hlynskyi	f22d62393b	fix(cli): actual Rust binding version in generated Cargo.toml	2021-06-30 00:36:11 +03:00
Andrew Hlynskyi	d3527109a8	Updating of binding.gyp should depend on its content instead of bindings/node folder	2021-06-23 02:42:48 +03:00
Andrew Hlynskyi	22d63338a2	Use double quoted patterns for more precise pattern matching in the binding.gyp files	2021-06-23 02:41:30 +03:00
Andrew Hlynskyi	86b8137457	Add create_path_else fn to handle creation or modification	2021-06-23 02:40:32 +03:00
Andrew Hlynskyi	797c7668c1	feat(cli): Independent language binding files generation	2021-06-23 02:39:38 +03:00
Andrew Hlynskyi	4578e58794	fix(cli): close #1032 - fix repository template url generation in cargo.toml	2021-06-23 01:02:29 +03:00
Douglas Creager	d2d01e77e3	cli: Use anyhow and thiserror for errors This patch updates the CLI to use anyhow and thiserror for error management. The main feature that our custom `Error` type was providing was a _list_ of messages, which would allow us to annotate "lower-level" errors with more contextual information. This is exactly what's provided by anyhow's `Context` trait. (This is setup work for a future PR that will pull the `config` and `loader` modules out into separate crates; by using `anyhow` we wouldn't have to deal with a circular dependency between the new crates.)	2021-06-09 16:17:23 -04:00
Andrew Hlynskyi	3c0152a331	chore(fmt): Apply 'cargo fmt' to the whole code base	2021-05-19 23:21:43 +03:00
Markus F.X.J. Oberhumer	cc519b3121	cli: Improve const-correctness of the generated parsers (part 2 of 2). This is a follow-up to my previous commit `1badd131f9`. I've made this an extra patch as it requires a minor API change in <tree_sitter/parser.h>. This commit moves the remaining generated tables into the read-only segment. Before: $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \ gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \ done $ size --totals *.o text data bss dec hex filename 5353477 24472 0 5377949 520f9d (TOTALS) After: $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \ gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \ done $ size --totals *.o 5378147 0 0 5378147 521063 (TOTALS)	2021-05-19 12:49:57 +02:00
Andrew Hlynskyi	b856f7e1bd	Remove unneeded dead_code annotations	2021-04-30 06:55:00 +03:00
Markus F.X.J. Oberhumer	1badd131f9	cli: Improve const-correctness of the generated parsers. This moves most of the generated tables from the data segment into the text segment (read-only memory) so that it can be shared between different processes. As a bonus side effect we can also remove all casts in the generated parsers. Before: size --totals target/scratch/*.so text data bss dec hex filename 853623 4684560 2160 5540343 5489f7 (TOTALS) After: size --totals target/scratch/*.so text data bss dec hex filename 5472086 68616 480 5541182 548d3e (TOTALS)	2021-04-27 09:22:18 +02:00
Andrew Hlynskyi	7aa538dd97	fix(cli): use dashed language name in generated package.json and Cargo.toml files	2021-04-22 16:29:48 +03:00
Andrew Hlynskyi	9416f975d3	fix(cli): set actual cli version in generated package.json	2021-04-22 16:29:48 +03:00
an-kumar	aabe6100d0	Update generated Cargo.toml's tree-sitter dependency tree-sitter 0.19.0 bumped the language version from 12 to 13. `npm install tree-sitter-cli` gets a recent version of tree-sitter, which generates languages with language version 13. However, the Cargo.toml generated from `tree-sitter generate` still has an old tree-sitter as a dependency. This causes the rust bindings to not work out of the box, as the tree-sitter library expects language version 12. It would be nice to add a test for this in CI. `tree-sitter generate` already creates a test for the rust binding, and that test fails out of the box due to the language mismatch.	2021-04-09 10:59:51 -07:00
Max Brunsfeld	c3eb5daa31	Include has_preceding_inherited_fields in Item's hash impl	2021-03-27 10:08:24 -07:00
Max Brunsfeld	57036b4f8a	Extract lexer helper functions for all large char sets No need to restrict it to char sets used in multiple places. This is important because the helper functions are now implemented more efficiently than the inline comparisons (using a binary search).	2021-03-11 11:48:48 -08:00
Andrew Hlynskyi	a331607f4e	dsl.js: Reuse sym() in RuleBuilder	2021-03-10 23:06:53 +02:00
Max Brunsfeld	9e50befcf8	For node-types.json, process supertypes in a stable order	2021-03-08 12:02:01 -08:00
Max Brunsfeld	8e894ff3f1	Add --no-bindings flag to generate subcommand	2021-03-08 12:01:45 -08:00
Max Brunsfeld	7300249d20	Fix incorrect merging of states with different inherited fields Co-authored-by: Douglas Creager <dcreager@dcreager.net>	2021-03-05 14:49:28 -08:00

257 commits