Commit graph

3931 commits

Author SHA1 Message Date
Max Brunsfeld
9866674cf8
Merge pull request #1660 from alex-pinkus/expanded-regex-support
Expand regex support to include emojis and binary ops
2022-02-24 17:14:23 -08:00
Max Brunsfeld
5eb0a3090f
Merge pull request #1547 from the-mikedavis/md-test-tags
test tags queries in 'tree-sitter test'
2022-02-24 15:22:15 -08:00
Patrick Thomson
3bd6fae4cc
Merge pull request #1649 from tree-sitter/tag-name-conventions
Describe tagging and associated naming conventions for syntax captures.
2022-02-24 09:26:00 -05:00
Max Brunsfeld
af00782dfd Add files needed for using clangd 2022-02-22 09:44:50 -08:00
Max Brunsfeld
d08f1af15c 🎨 2022-02-22 09:43:57 -08:00
Max Brunsfeld
be71c6e3e9
Merge pull request #1567 from jamessan/config-min-serde-json-ver
config: Bump minimum serde_json version to 1.0.45
2022-02-21 19:47:03 -08:00
Alex Pinkus
8fadf18655 Expand regex support to include emojis and binary ops
The `Emoji` property alias is already present, but the actual property
is not available since it lives in a new file. This adds that file to
the `generate-unicode-categories-json`.

The `emoji-data` file follows the same format as the ones we already
consume in `generate-unicode-categories-json`, so adding emoji support
is fairly easy. Without this, grammars would need to hard-code a set of
unicode ranges in their own regex. The Javascript library `emoji-regex`
cannot be used because of #451.

For unclear reasons, the characters #, *, and 0-9 are marked as
`Emoji=Yes` by `emoji-data.txt`. Because of this, a grammar that wishes
to use emojis is likely to want to exclude those characters. For that
reason, this change also adds support for binary operations in regexes,
e.g. `[\p{Emoji}&&[^#*0-9]]`.

Lastly (and perhaps controversially), this change introduces new
variables available at grammar compile time, for the major, minor, and
patch versions of the tree-sitter CLI used to compile the grammar. This
will allow grammars to conditionally adopt these new regex features
while remaining backward compatible with older versions of the CLI.
Without this part of the change, grammar authors who do not precompile
and check-in their `grammar.json` would need to wait for downstream
systems to adopt a newer tree-sitter CLI version before they could begin
to use these features.
2022-02-19 11:41:36 -08:00
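
The class `[\p{Emoji}&&[^#*0-9]]` from the message above is an intersection: a character matches only if both operands accept it. A minimal sketch of that semantics follows. It assumes a tiny hand-picked stand-in for the `Emoji` property (the keycap bases plus the U+1F600..U+1F64F block) and is not the real Unicode table or tree-sitter's regex code; all function names are hypothetical.

```rust
/// Stand-in for the Unicode `Emoji` property: a small, hand-picked subset
/// (an assumption for illustration, not the real property data).
fn is_emoji_standin(c: char) -> bool {
    matches!(c, '#' | '*' | '0'..='9' | '\u{1F600}'..='\u{1F64F}')
}

/// Right-hand operand `[^#*0-9]`: anything except '#', '*', or an ASCII digit.
fn not_keycap_base(c: char) -> bool {
    !matches!(c, '#' | '*' | '0'..='9')
}

/// `[A&&B]` accepts a character only when both A and B accept it.
fn in_intersection(c: char) -> bool {
    is_emoji_standin(c) && not_keycap_base(c)
}

fn main() {
    for c in ['😀', '#', '7', 'a'] {
        println!("{:?}: {}", c, in_intersection(c));
    }
}
```

Here only '😀' is accepted: '#' and '7' are excluded by the right-hand operand, and 'a' is not in the stand-in emoji set.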
Patrick Thomson
764c8c88ca last tweaks 2022-02-18 09:24:04 -05:00
Patrick Thomson
27019d1172 demonstrate that select-adjacent works 2022-02-17 18:28:09 -05:00
Patrick Thomson
65da86f16f Missing plural here. 2022-02-17 18:11:01 -05:00
Patrick Thomson
48748ee332 Typo. 2022-02-17 18:05:50 -05:00
Patrick Thomson
e1ac2e2648 Better nomenclature. 2022-02-17 18:05:19 -05:00
Patrick Thomson
4c60217345 Flesh out output. 2022-02-17 17:43:14 -05:00
Patrick Thomson
69a5f77eab Describe how to use tree-sitter tags as well. 2022-02-17 17:34:15 -05:00
Patrick Thomson
1fbace136d Add examples. 2022-02-17 17:20:21 -05:00
Patrick Thomson
70077b8205 Incorporate @dcreager's excellent suggestions. 2022-02-17 14:00:34 -05:00
Patrick Thomson
f41e13f5da Spacing and word choice. 2022-02-11 15:41:53 -05:00
Patrick Thomson
88822bd3fc Move this to its own page. 2022-02-11 15:25:50 -05:00
Patrick Thomson
302c8b5305 Move this inside the query section. 2022-02-11 11:25:11 -05:00
Patrick Thomson
8dfed40466 Describe naming conventions for syntax captures. 2022-02-11 11:17:18 -05:00
Max Brunsfeld
5ef4ef4e2e lib: 0.20.4 2022-02-04 13:13:21 -08:00
Max Brunsfeld
84c1c6a271
Merge pull request #1640 from tree-sitter/goto-to-first-child-for-byte-gte
Change cursor `goto_first_child_for_{byte,point}` methods to treat nodes' ranges inclusively
2022-02-04 13:12:14 -08:00
Max Brunsfeld
cb4317ba8e Change goto_first_child_for_{byte,point} to compare nodes' ranges inclusively
Co-Authored-By: Antonio Scandurra <me@as-cii.com>
2022-02-04 12:38:33 -08:00
Max Brunsfeld
714bfd47a7 0.20.4 2022-01-23 10:46:04 -08:00
Max Brunsfeld
994cb61f2c Always generate parser.h, regardless of chosen ABI version
For some ABI changes, we may need to make changes to the parser.h in order
to restore a previous binary format, but for the current range of supported
ABI versions (13 + 14), the current parser.h is fine.

Refs #1599
2022-01-23 10:29:52 -08:00
Max Brunsfeld
3ff5c19403 0.20.3 2022-01-21 16:36:45 -08:00
Max Brunsfeld
fab8540508 libs: 0.20.3 2022-01-21 16:35:37 -08:00
Max Brunsfeld
82ceebc10d 🎨 Use base struct syntax to clean up grammar expectations 2022-01-20 17:17:46 -08:00
Max Brunsfeld
584b55df8d Delete unused code, tweak whitespace 2022-01-19 16:54:57 -08:00
Max Brunsfeld
fce23d63b3
Merge pull request #1602 from the-mikedavis/md-ignore-future-matches-for-non-local-patterns
prevent future captures for `#is-not? local` matches
2022-01-19 16:40:30 -08:00
Michael Davis
02abc2a063
add test for removals in eager query matches 2022-01-18 20:54:55 -06:00
Michael Davis
a3609aa07e
remove non-local query matches for locals 2022-01-18 17:04:00 -06:00
Michael Davis
716ef24578
remove unfinished queries from 'ts_query_cursor_remove_match' 2022-01-18 17:01:07 -06:00
Michael Davis
51354ef776
use just an i32 to ignore match IDs 2022-01-17 22:20:05 -06:00
Michael Davis
83ef0aea12
prevent future matches for '#is-not? local' patterns 2022-01-17 22:03:09 -06:00
Max Brunsfeld
2346570901
Merge pull request #1601 from alex-pinkus/fix-abi-14-struct-ordering
Fix ABI 14 back compat by moving primary_field_ids to the end
2022-01-17 18:02:35 -08:00
Alex Pinkus
858ea5782b Fix back compat by moving primary_field_ids to the end
Due to an oversight in #1589, I added `primary_field_ids` into the
`TSLanguage` struct in a place that wasn't the end. This is not actually
backwards compatible and causes downstream failures :(
2022-01-17 17:23:02 -08:00
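
To illustrate why the position of a new struct field matters for binary compatibility, here is a minimal sketch using C-layout structs. The struct and field names are hypothetical and are not the real `TSLanguage` definition; the point is only that inserting a field in the middle shifts the offsets of later fields, while appending at the end leaves them unchanged.

```rust
#![allow(dead_code)]

// Hypothetical layouts for comparison; NOT the real `TSLanguage` fields.
#[repr(C)]
#[derive(Default)]
struct Old {
    version: u32,
    state_count: u32,
}

#[repr(C)]
#[derive(Default)]
struct InsertedInMiddle {
    version: u32,
    new_field: u32,
    state_count: u32,
}

#[repr(C)]
#[derive(Default)]
struct AppendedAtEnd {
    version: u32,
    state_count: u32,
    new_field: u32,
}

/// Byte offset of `field` within `base`, computed from their addresses.
fn offset<T, F>(base: &T, field: &F) -> usize {
    field as *const F as usize - base as *const T as usize
}

/// Offsets of `state_count` in the three layouts, in declaration order.
fn state_count_offsets() -> (usize, usize, usize) {
    let old = Old::default();
    let mid = InsertedInMiddle::default();
    let end = AppendedAtEnd::default();
    (
        offset(&old, &old.state_count),
        offset(&mid, &mid.state_count),
        offset(&end, &end.state_count),
    )
}

fn main() {
    // Code compiled against `Old` reads `state_count` at the first offset.
    println!("{:?}", state_count_offsets()); // prints (4, 8, 4)
}
```

A consumer built against the old layout would read the wrong bytes from the middle-insertion layout, but still finds `state_count` in the appended layout.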
Max Brunsfeld
691469c783
Merge pull request #1599 from tree-sitter/abi-version-flag
Add `--abi` flag to the generate command, generate version 13 by default
2022-01-17 15:41:28 -08:00
Max Brunsfeld
516fd6f6de Add --abi flag to generate command, generate version 13 by default 2022-01-17 14:50:47 -08:00
Max Brunsfeld
aaf4572727
Merge pull request #1589 from alex-pinkus/deduplicate-core-ids
Ignore duplicate states when initializing subgraphs in `ts_query__analyze_patterns`
2022-01-17 13:54:31 -08:00
Alex Pinkus
eaf9b170f1 Don't start with duplicate states in ts_query__analyze_patterns
This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).

With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
2022-01-16 11:17:47 -08:00
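
The deduplication described in the message above can be sketched as follows. This is a simplified illustration, assuming each parse state is identified by its index and carries a `core_id`; the function names and the plain `u16` slices are hypothetical and do not reproduce tree-sitter's generator or query-analysis code.

```rust
use std::collections::HashMap;

/// For each state, return the ID of the first state seen with the same core ID.
/// `core_ids[i]` is the (hypothetical) core ID of state `i`.
fn primary_state_ids(core_ids: &[u16]) -> Vec<u16> {
    let mut first_with_core: HashMap<u16, u16> = HashMap::new();
    let mut primary = Vec::with_capacity(core_ids.len());
    for (state_id, &core_id) in core_ids.iter().enumerate() {
        // The first state encountered for a core ID becomes its primary state.
        let id = *first_with_core.entry(core_id).or_insert(state_id as u16);
        primary.push(id);
    }
    primary
}

/// States that are their own primary: the only ones an analysis would need
/// to use as starting points under this scheme.
fn primary_states(primary: &[u16]) -> Vec<u16> {
    primary
        .iter()
        .enumerate()
        .filter(|&(state_id, &p)| p as usize == state_id)
        .map(|(_, &p)| p)
        .collect()
}

fn main() {
    let core_ids = [0u16, 1, 1, 2, 0];
    let primary = primary_state_ids(&core_ids);
    println!("{:?}", primary); // prints [0, 1, 1, 3, 0]
    println!("{:?}", primary_states(&primary)); // prints [0, 1, 3]
}
```

In this toy input, states 2 and 4 share a core with an earlier state, so they are skipped as starting points.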
Max Brunsfeld
e96ee19901
Merge pull request #1504 from hendrikvanantwerpen/expose-capture-suffixes
Expose capture suffixes in queries
2022-01-14 12:11:25 -08:00
Max Brunsfeld
9df064c9fe
Merge pull request #1581 from tlaplus-community/get-codepoint-column
`get_column` now counts codepoints instead of bytes
2022-01-12 10:49:44 -08:00
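
The difference between a byte-based and a codepoint-based column can be shown with a short sketch. It does not reproduce the lexer's `get_column` implementation; the helper names are hypothetical, and the column is taken to be the length of the text preceding the position on the current line.

```rust
/// Column measured in bytes of the UTF-8 text before the position.
fn byte_column(prefix: &str) -> usize {
    prefix.len()
}

/// Column measured in Unicode codepoints before the position.
fn codepoint_column(prefix: &str) -> usize {
    prefix.chars().count()
}

fn main() {
    // 'λ' is 2 bytes and '→' is 3 bytes in UTF-8; the rest are 1 byte each.
    let prefix = "λx → ";
    println!("bytes: {}", byte_column(prefix)); // prints bytes: 8
    println!("codepoints: {}", codepoint_column(prefix)); // prints codepoints: 5
}
```

For ASCII-only lines the two measures agree; they diverge as soon as the line contains multi-byte characters.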
Andrew Helwer
e1ee261181 Changed decimal unicode codepoint to hex 2022-01-11 19:15:36 -05:00
Hendrik van Antwerpen
9dace8f9fe Add explicit breaks to prevent fall through errors 2022-01-11 19:08:32 +01:00
Hendrik van Antwerpen
c76d8ee076 Represent quantifiers using bytes instead of ints 2022-01-11 18:41:33 +01:00
Hendrik van Antwerpen
70aee901ac Reduce error handling logic 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
acd3d32c36 Remove reference to strings from quantifier-only function 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
99e74fa0f5 Move quantifier addition out of loop and drop conditional 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
93db863729 Remove obsolete FIXMEs 2022-01-11 18:33:36 +01:00