Commit graph

61 commits

Author SHA1 Message Date
Max Brunsfeld
9c0535cea6 Fix logic for aborting failed matches 2020-05-08 14:15:25 -07:00
Max Brunsfeld
b0671aea6a Reorder some code in ts_query_cursor__advance 2020-05-08 12:13:21 -07:00
Max Brunsfeld
b47c170c75 Query: fix bugs and add tests for top-level and nested repetitions 2020-05-08 12:10:01 -07:00
Max Brunsfeld
3ad71625dd Fix query bugs, expand and clean up query tests 2020-05-07 14:22:15 -07:00
Max Brunsfeld
1011be76b7 Handle trailing optional nodes in queries 2020-05-07 12:41:25 -07:00
Max Brunsfeld
3456a21f0d Start work on restructuring query implementation to deal w/ optionals and repeats better 2020-05-07 12:41:25 -07:00
Max Brunsfeld
322b311c2c Clear QueryCursor state between exec calls 2020-03-26 16:10:39 -07:00
Max Brunsfeld
65f2874b9e query: Optimize handling of patterns with a wildcard at the root
Avoid adding and removing states for these patterns on every node in the tree
by just skipping the wildcard step of the matching process
2020-03-16 14:02:31 -07:00
Max Brunsfeld
b5483c67ab query: allow repetition operator to be used on non-terminal nodes 2020-03-13 16:12:39 -07:00
Max Brunsfeld
6f636a0357 query: Add postfix '+' operator for token repetition
Co-Authored-By: Patrick Thomson <patrickt@users.noreply.github.com>
2020-03-12 15:10:58 -07:00
Max Brunsfeld
e3aad995f6 query: Fix handling of patterns with wildcards at the root 2020-03-11 13:14:16 -07:00
Max Brunsfeld
741eed01b7 query: Handle escape sequences and escaped quotes in string literals 2020-03-10 15:50:06 -07:00
Max Brunsfeld
570b83e2b2 query: Add immediate child operator 2020-02-19 11:47:52 -08:00
Max Brunsfeld
950a89a525 query: Differentiate between wildcard '*' and named wildcard '(*)' 2020-02-19 09:42:29 -08:00
Max Brunsfeld
1d6ea51b63 query: Make * operator only match named nodes 2020-02-18 21:32:52 -08:00
Max Brunsfeld
de8e3ee188 query: Allow multiple captures on a single node 2020-02-11 16:02:32 -08:00
Max Brunsfeld
f3747863df Add ts_query_disable_pattern API 2020-01-15 17:08:55 -08:00
Max Brunsfeld
3c4a24752b Tweak naming of TSQuery's pattern map variables 2020-01-15 17:08:07 -08:00
Maxim Sukharev
edb5693100 include language.h in query.c (#507)
Building `query.c` requires `TREE_SITTER_LANGUAGE_VERSION_WITH_SYMBOL_DEDUPING` which is defined in `language.h`.

It produces an error:
```
query.c:744:40: error: use of undeclared identifier 'TREE_SITTER_LANGUAGE_VERSION_WITH_SYMBOL_DEDUPING'
```

when building with cgo.
2019-12-16 09:38:18 -08:00
Max Brunsfeld
6d1d8cc217 query: Skip workaround code path when using new symbol map field 2019-12-06 12:11:45 -08:00
Max Brunsfeld
56c620c005 Store a mapping to ensure no two symbols map to the same metadata 2019-12-05 17:21:46 -08:00
Maxim Sukharev
a647de1ef5 add missing unicode include to query.c
it causes problems with building tree-sitter with cgo
2019-11-28 01:32:41 +01:00
Max Brunsfeld
e3f6b1a1af Query - If too many states, kill the one w/ the earliest capture 2019-11-22 11:54:12 -08:00
Damien Guard
599e4f0ec4 Fix a few compiler warnings 2019-11-20 10:21:10 -08:00
Max Brunsfeld
ce633a85c6 Improve ts_language_symbol_for_name function 2019-11-15 14:21:13 -08:00
Max Brunsfeld
e14e285a10 cli: Check queries when running tree-sitter test 2019-10-18 14:44:16 -07:00
Max Brunsfeld
fa43ce01a6 Allow queries to capture ERROR nodes 2019-10-16 11:54:32 -07:00
Max Brunsfeld
f490befcde Add ts_query_disable_capture API 2019-10-14 12:30:22 -07:00
Max Brunsfeld
4c17af3ecd Allow queries with no patterns 2019-10-14 12:30:22 -07:00
Max Brunsfeld
c153711539 query: Avoid splitting states on nodes that don't contain captures 2019-10-14 12:30:22 -07:00
Matthew Krupcale
ee9a3c0ebb lib: remove utf8proc dependency (#436)
* Remove dependency on utf8proc

This removes the only external dependency on utf8proc for UTF-8 decoding. It does so by implementing its own UTF-8 decoder. This decoder is both faster and has a simpler API.

 * .gitmodules: remove utf8proc submodule
 * docs/section-2-using-parsers.md: remove requirement for utf8proc submodule
 * docs/section-6-contributing.md: likewise
 * lib/Cargo.toml: remove utf8proc subdirectory package include
 * lib/README.md: remove utf8proc subdirectory description
 * lib/binding_rust/build.rs: remove utf8proc compiler include directory
 * lib/src/lexer.c: remove utf8proc dependencies and types
 * lib/src/lib.c: remove utf8proc dependency
 * lib/src/unicode.h: define types for Unicode decoders
 * lib/src/utf16.{c,h}: implement more readable UTF-16 decoder
 * lib/src/utf8.{c,h}: implement fast UTF-8 decoder
 * lib/utf8proc: remove utf8proc submodule directory
 * script/build-lib: remove utf8proc compiler include directory
 * script/build-wasm: likewise

* Optimize ts_lexer__get_lookahead.

Try to favor non-failure code path and assign lookahead values directly to lexer

 * lib/src/lexer.c: optimize for non-failure code path

* Fix some compiler errors

 * lib/src/lexer.c: cast from signed to unsigned for decode_next result
 * lib/src/utf16.c: fix non-constant initializers for older compilers

* Remove some missed remnants of utf8proc

 * docs/section-2-using-parsers.md: only two include paths necessary now
 * lib/src/lib.c: no need to define UTF8PROC_STATIC

* Use ICU's utf8 and utf16 decoding routines

* Remove unnecessary casts when calling icu macros

* Check buffer length before attempting to decode a unicode character

* Use new unicode function when parsing Queries

Co-Authored-By: Matthew Krupcale <mkrupcale@matthewkrupcale.com>

* Mark libicu files as vendored for GitHub's stats
2019-10-14 11:18:39 -07:00
Max Brunsfeld
9872a083b7 rust: Change QueryCursor::captures to expose the full match 2019-10-03 12:45:58 -07:00
Max Brunsfeld
cb87b7b76e Fix invalid read by query cursor on error nodes
🎩 @bfredl

Refs https://github.com/tree-sitter/tree-sitter/pull/448#issuecomment-536337749
2019-10-01 11:28:51 -07:00
Björn Linse
1d2d043390 fix compiler warning with comparing char with `TSSymbolType' 2019-09-30 19:24:40 +02:00
Max Brunsfeld
b15e90bd26 Handle set! predicate function in queries 2019-09-24 11:54:24 -07:00
Max Brunsfeld
ff9a2c1f53 Make queries work in languages with simple aliases 2019-09-24 11:54:24 -07:00
Björn Linse
15e3bc7fd2 Fix some compiler warnings regarding function prototypes 2019-09-22 11:49:44 +02:00
Max Brunsfeld
a6b6a681ec Fix a bug that prevented early termination of query matches 2019-09-18 16:13:10 -07:00
Max Brunsfeld
186b08381c Terminate failed query matches before descending whenever possible
When iterating over captures, this prevents reasonable queries from 
forcing the tree cursor to buffer matches unnecessarily.
2019-09-18 11:37:49 -07:00
Max Brunsfeld
374a7ac81e Ensure that duplicate captures are ordered by pattern index 2019-09-17 16:27:16 -07:00
Max Brunsfeld
82955759c0 Add an API for getting a pattern's start offset in the source code 2019-09-17 16:19:58 -07:00
Max Brunsfeld
fdd3a34e70 Fix some comments 2019-09-17 15:05:12 -07:00
Max Brunsfeld
2d1ca8bc9f Fix match return order fom ts_query_cursor_next_match 2019-09-17 14:52:27 -07:00
Max Brunsfeld
1af85dc3f7 Remove unused APIs, expand docs for predicate API 2019-09-16 15:00:32 -07:00
Max Brunsfeld
7793bf2a5a Clean up query code 2019-09-16 11:33:22 -07:00
Max Brunsfeld
d4d554b2ae Add wasm bindings for predicates 2019-09-16 10:25:44 -07:00
Max Brunsfeld
096126d039 Allow predicates in queries, to match on nodes' text 2019-09-15 22:06:51 -07:00
Max Brunsfeld
86205b9e6d Fix infinite loop on unterminated string in query 2019-09-13 15:19:21 -07:00
Max Brunsfeld
a1fec71b19 Tweak QueryCursor to allow iterating either matches or captures
For syntax highlighting, we want to iterate over all of the captures in 
order, and don't care about grouping the captures by pattern.
2019-09-13 15:19:04 -07:00
Max Brunsfeld
33587c924a Remove an unused field, clean up some comments 2019-09-12 17:00:01 -07:00