64 commits

Author SHA1 Message Date
Max Brunsfeld
68f43b5865 Make query syntax backward-compatible 2020-05-11 13:23:44 -07:00
Max Brunsfeld
85c998d572 Change the wildcard syntax in tree queries
1. Use '_' instead of '*'.
2. Add '*' as a postfix operator for zero-or-more repetitions

Signed-off-by: Patrick Thomson <patrickt@github.com>
2020-05-11 13:04:04 -07:00
Max Brunsfeld
40262483a9 Change query syntax for predicates
Signed-off-by: Patrick Thomson <patrickt@github.com>
2020-05-11 12:35:51 -07:00
Max Brunsfeld
9c0535cea6 Fix logic for aborting failed matches 2020-05-08 14:15:25 -07:00
Max Brunsfeld
b0671aea6a Reorder some code in ts_query_cursor__advance 2020-05-08 12:13:21 -07:00
Max Brunsfeld
b47c170c75 Query: fix bugs and add tests for top-level and nested repetitions 2020-05-08 12:10:01 -07:00
Max Brunsfeld
3ad71625dd Fix query bugs, expand and clean up query tests 2020-05-07 14:22:15 -07:00
Max Brunsfeld
1011be76b7 Handle trailing optional nodes in queries 2020-05-07 12:41:25 -07:00
Max Brunsfeld
3456a21f0d Start work on restructuring query implementation to deal w/ optionals and repeats better 2020-05-07 12:41:25 -07:00
Max Brunsfeld
322b311c2c Clear QueryCursor state between exec calls 2020-03-26 16:10:39 -07:00
Max Brunsfeld
65f2874b9e query: Optimize handling of patterns with a wildcard at the root
Avoid adding and removing states for these patterns on every node in the tree
by just skipping the wildcard step of the matching process
2020-03-16 14:02:31 -07:00
Max Brunsfeld
b5483c67ab query: allow repetition operator to be used on non-terminal nodes 2020-03-13 16:12:39 -07:00
Max Brunsfeld
6f636a0357 query: Add postfix '+' operator for token repetition
Co-Authored-By: Patrick Thomson <patrickt@users.noreply.github.com>
2020-03-12 15:10:58 -07:00
Max Brunsfeld
e3aad995f6 query: Fix handling of patterns with wildcards at the root 2020-03-11 13:14:16 -07:00
Max Brunsfeld
741eed01b7 query: Handle escape sequences and escaped quotes in string literals 2020-03-10 15:50:06 -07:00
Max Brunsfeld
570b83e2b2 query: Add immediate child operator 2020-02-19 11:47:52 -08:00
Max Brunsfeld
950a89a525 query: Differentiate between wildcard '*' and named wildcard '(*)' 2020-02-19 09:42:29 -08:00
Max Brunsfeld
1d6ea51b63 query: Make * operator only match named nodes 2020-02-18 21:32:52 -08:00
Max Brunsfeld
de8e3ee188 query: Allow multiple captures on a single node 2020-02-11 16:02:32 -08:00
Max Brunsfeld
f3747863df Add ts_query_disable_pattern API 2020-01-15 17:08:55 -08:00
Max Brunsfeld
3c4a24752b Tweak naming of TSQuery's pattern map variables 2020-01-15 17:08:07 -08:00
Maxim Sukharev
edb5693100 include language.h in query.c (#507)
Building `query.c` requires `TREE_SITTER_LANGUAGE_VERSION_WITH_SYMBOL_DEDUPING` which is defined in `language.h`.

It produces an error:
```
query.c:744:40: error: use of undeclared identifier 'TREE_SITTER_LANGUAGE_VERSION_WITH_SYMBOL_DEDUPING'
```

when building with cgo.
2019-12-16 09:38:18 -08:00
Max Brunsfeld
6d1d8cc217 query: Skip workaround code path when using new symbol map field 2019-12-06 12:11:45 -08:00
Max Brunsfeld
56c620c005 Store a mapping to ensure no two symbols map to the same metadata 2019-12-05 17:21:46 -08:00
Maxim Sukharev
a647de1ef5 add missing unicode include to query.c
it causes problems with building tree-sitter with cgo
2019-11-28 01:32:41 +01:00
Max Brunsfeld
e3f6b1a1af Query - If too many states, kill the one w/ the earliest capture 2019-11-22 11:54:12 -08:00
Damien Guard
599e4f0ec4 Fix a few compiler warnings 2019-11-20 10:21:10 -08:00
Max Brunsfeld
ce633a85c6 Improve ts_language_symbol_for_name function 2019-11-15 14:21:13 -08:00
Max Brunsfeld
e14e285a10 cli: Check queries when running tree-sitter test 2019-10-18 14:44:16 -07:00
Max Brunsfeld
fa43ce01a6 Allow queries to capture ERROR nodes 2019-10-16 11:54:32 -07:00
Max Brunsfeld
f490befcde Add ts_query_disable_capture API 2019-10-14 12:30:22 -07:00
Max Brunsfeld
4c17af3ecd Allow queries with no patterns 2019-10-14 12:30:22 -07:00
Max Brunsfeld
c153711539 query: Avoid splitting states on nodes that don't contain captures 2019-10-14 12:30:22 -07:00
Matthew Krupcale
ee9a3c0ebb lib: remove utf8proc dependency (#436)
* Remove dependency on utf8proc

This removes the only external dependency on utf8proc for UTF-8 decoding. It does so by implementing its own UTF-8 decoder. This decoder is both faster and has a simpler API.

 * .gitmodules: remove utf8proc submodule
 * docs/section-2-using-parsers.md: remove requirement for utf8proc submodule
 * docs/section-6-contributing.md: likewise
 * lib/Cargo.toml: remove utf8proc subdirectory package include
 * lib/README.md: remove utf8proc subdirectory description
 * lib/binding_rust/build.rs: remove utf8proc compiler include directory
 * lib/src/lexer.c: remove utf8proc dependencies and types
 * lib/src/lib.c: remove utf8proc dependency
 * lib/src/unicode.h: define types for Unicode decoders
 * lib/src/utf16.{c,h}: implement more readable UTF-16 decoder
 * lib/src/utf8.{c,h}: implement fast UTF-8 decoder
 * lib/utf8proc: remove utf8proc submodule directory
 * script/build-lib: remove utf8proc compiler include directory
 * script/build-wasm: likewise

* Optimize ts_lexer__get_lookahead.

Try to favor non-failure code path and assign lookahead values directly to lexer

 * lib/src/lexer.c: optimize for non-failure code path

* Fix some compiler errors

 * lib/src/lexer.c: cast from signed to unsigned for decode_next result
 * lib/src/utf16.c: fix non-constant initializers for older compilers

* Remove some missed remnants of utf8proc

 * docs/section-2-using-parsers.md: only two include paths necessary now
 * lib/src/lib.c: no need to define UTF8PROC_STATIC

* Use ICU's utf8 and utf16 decoding routines

* Remove unnecessary casts when calling icu macros

* Check buffer length before attempting to decode a unicode character

* Use new unicode function when parsing Queries

Co-Authored-By: Matthew Krupcale <mkrupcale@matthewkrupcale.com>

* Mark libicu files as vendored for GitHub's stats
2019-10-14 11:18:39 -07:00
Max Brunsfeld
9872a083b7 rust: Change QueryCursor::captures to expose the full match 2019-10-03 12:45:58 -07:00
Max Brunsfeld
cb87b7b76e Fix invalid read by query cursor on error nodes
🎩 @bfredl

Refs https://github.com/tree-sitter/tree-sitter/pull/448#issuecomment-536337749
2019-10-01 11:28:51 -07:00
Björn Linse
1d2d043390 fix compiler warning with comparing char with `TSSymbolType` 2019-09-30 19:24:40 +02:00
Max Brunsfeld
b15e90bd26 Handle set! predicate function in queries 2019-09-24 11:54:24 -07:00
Max Brunsfeld
ff9a2c1f53 Make queries work in languages with simple aliases 2019-09-24 11:54:24 -07:00
Björn Linse
15e3bc7fd2 Fix some compiler warnings regarding function prototypes 2019-09-22 11:49:44 +02:00
Max Brunsfeld
a6b6a681ec Fix a bug that prevented early termination of query matches 2019-09-18 16:13:10 -07:00
Max Brunsfeld
186b08381c Terminate failed query matches before descending whenever possible
When iterating over captures, this prevents reasonable queries from 
forcing the tree cursor to buffer matches unnecessarily.
2019-09-18 11:37:49 -07:00
Max Brunsfeld
374a7ac81e Ensure that duplicate captures are ordered by pattern index 2019-09-17 16:27:16 -07:00
Max Brunsfeld
82955759c0 Add an API for getting a pattern's start offset in the source code 2019-09-17 16:19:58 -07:00
Max Brunsfeld
fdd3a34e70 Fix some comments 2019-09-17 15:05:12 -07:00
Max Brunsfeld
2d1ca8bc9f Fix match return order from ts_query_cursor_next_match 2019-09-17 14:52:27 -07:00
Max Brunsfeld
1af85dc3f7 Remove unused APIs, expand docs for predicate API 2019-09-16 15:00:32 -07:00
Max Brunsfeld
7793bf2a5a Clean up query code 2019-09-16 11:33:22 -07:00
Max Brunsfeld
d4d554b2ae Add wasm bindings for predicates 2019-09-16 10:25:44 -07:00
Max Brunsfeld
096126d039 Allow predicates in queries, to match on nodes' text 2019-09-15 22:06:51 -07:00