346 commits

Author SHA1 Message Date
Hendrik van Antwerpen
a1a241b013 Expose quantifiers per pattern, instead of merging for all patterns in a query 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1d513bcf67 Rewrite quantifier operations
- Simplify control flow by having a single return at the end of the function.
- Follow enum order for case order.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1f1a449c76 Improve capture quantifier computation
Compute quantifiers in a bottom-up manner, which allows more precise
results for alternations, where the quantifiers are now precisely joined.
2022-01-11 18:33:36 +01:00
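One way to picture a "precise join" of quantifiers across the branches of an alternation is to treat each quantifier as a pair of bounds and take the loosest bounds that cover both branches. The sketch below is only an illustration under that assumption; the `Quantifier` enum, the `Q_*` names, and `quantifier_join` are hypothetical and are not taken from the commit.

```c
#include <assert.h>

// Hypothetical quantifier type: how many times a capture may occur.
typedef enum {
  Q_ZERO,
  Q_ZERO_OR_ONE,
  Q_ZERO_OR_MORE,
  Q_ONE,
  Q_ONE_OR_MORE
} Quantifier;

// Upper bounds are 0, 1, or MANY (any count above one).
enum { MANY = 2 };

// Lower bound: 1 if at least one occurrence is required, else 0.
static int q_min(Quantifier q) {
  return (q == Q_ONE || q == Q_ONE_OR_MORE) ? 1 : 0;
}

// Upper bound: 0, 1, or MANY.
static int q_max(Quantifier q) {
  switch (q) {
    case Q_ZERO: return 0;
    case Q_ZERO_OR_ONE:
    case Q_ONE: return 1;
    default: return MANY;
  }
}

// Map a (min, max) pair back to the enum.
static Quantifier q_from_bounds(int min, int max) {
  assert(min <= max);
  if (max == 0) return Q_ZERO;
  if (min == 0) return max == 1 ? Q_ZERO_OR_ONE : Q_ZERO_OR_MORE;
  return max == 1 ? Q_ONE : Q_ONE_OR_MORE;
}

// Join for alternation: either branch may match, so the result must
// allow the smaller lower bound and the larger upper bound.
Quantifier quantifier_join(Quantifier a, Quantifier b) {
  int min = q_min(a) < q_min(b) ? q_min(a) : q_min(b);
  int max = q_max(a) > q_max(b) ? q_max(a) : q_max(b);
  return q_from_bounds(min, max);
}
```

Under this model, a capture that occurs exactly once in one branch and never in the other joins to "zero or one", which is tighter than falling back to "zero or more".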
Hendrik van Antwerpen
9bac066330 Deal with quantifiers appearing on capture's enclosing patterns
- Use a proper enum type for quantifiers.
- Drop quantifiers from `TSQueryStep`, which was not used.
- Keep track of the captures introduced during a pattern parse, and
  apply the quantifier for the pattern to the captures that were
  introduced by the pattern or any sub patterns.
- Use 'quantifier' instead of 'suffix'.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ae7869d1a6 Expose capture suffixes in queries 2022-01-11 18:33:36 +01:00
Andrew Helwer
5a6530a413 Added tests 2022-01-11 12:05:37 -05:00
Alex Pinkus
679a841183 Add pointer indirection to AnalysisStateSet
Profiling the `ts_query__analyze_patterns` function shows that it
spends a lot of time copying items in its various state sets. These
state sets are kept sorted, and the items are fairly large, so any time
that we insert new entries near the front of the array, a lot of calls
to memcpy must occur.

In advance of more sophisticated rework, one easy win is to hide the
large `AnalysisStateSet` objects behind pointers, so that the size of
each item in the list goes from 68 to 8 bytes, and add an object pool to
reuse allocations. This shows a significant performance improvement for
grammars that have a lot of states in them.
2022-01-10 20:07:14 -08:00
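The idea described in this message can be sketched generically: keep a sorted array of pointers rather than a sorted array of large structs, so that an insertion near the front shifts pointer-sized slots only, and recycle released items through a pool. Everything below (`Item`, `PtrArray`, `ItemSet`, and the function names) is a hypothetical stand-in, not the actual tree-sitter code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical large item standing in for an analysis state.
typedef struct {
  uint32_t key;
  uint8_t payload[64];
} Item;

// Growable array of pointers to items.
typedef struct {
  Item **slots;
  size_t size, capacity;
} PtrArray;

// A sorted set of items plus a pool of released items awaiting reuse.
typedef struct {
  PtrArray items;
  PtrArray pool;
} ItemSet;

// Insert a pointer at `index`; only pointer-sized slots are shifted.
static void ptr_array_insert(PtrArray *a, size_t index, Item *item) {
  assert(index <= a->size);
  if (a->size == a->capacity) {
    size_t capacity = a->capacity ? a->capacity * 2 : 8;
    Item **slots = realloc(a->slots, capacity * sizeof(Item *));
    if (!slots) abort();
    a->slots = slots;
    a->capacity = capacity;
  }
  memmove(&a->slots[index + 1], &a->slots[index],
          (a->size - index) * sizeof(Item *));
  a->slots[index] = item;
  a->size++;
}

// Take an item from the pool if one is available, else allocate.
static Item *item_set_alloc(ItemSet *set) {
  if (set->pool.size > 0) return set->pool.slots[--set->pool.size];
  Item *item = malloc(sizeof(Item));
  if (!item) abort();
  return item;
}

// Insert a new item, keeping `items` sorted by key.
void item_set_insert(ItemSet *set, uint32_t key) {
  size_t i = 0;
  while (i < set->items.size && set->items.slots[i]->key < key) i++;
  Item *item = item_set_alloc(set);
  memset(item, 0, sizeof *item);
  item->key = key;
  ptr_array_insert(&set->items, i, item);
}

// Empty the set, moving every item into the pool instead of freeing it.
void item_set_clear(ItemSet *set) {
  for (size_t i = 0; i < set->items.size; i++) {
    ptr_array_insert(&set->pool, set->pool.size, set->items.slots[i]);
  }
  set->items.size = 0;
}
```

A zero-initialized `ItemSet` is a valid empty set in this sketch. After `item_set_clear`, later insertions reuse pooled items before calling `malloc` again.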
Andrew Helwer
80c34d62ab Fixed rust build, updated docs 2022-01-07 10:36:25 -05:00
Andrew Helwer
3ab6d1b937 Improve diff further 2022-01-07 10:17:53 -05:00
Andrew Helwer
bfb692d2f7 Improve diff 2022-01-07 10:16:20 -05:00
Andrew Helwer
ace81f6267 Don't log when counting codepoints 2022-01-07 10:13:57 -05:00
Andrew Helwer
0a52e90b01 Fixed pointer type 2022-01-07 10:13:57 -05:00
Andrew Helwer
75aa295b66 get_column now counts codepoints 2022-01-07 10:13:57 -05:00
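To illustrate the difference between a byte column and a codepoint column, here is a minimal sketch that counts codepoints in a UTF-8 byte range. It assumes valid UTF-8 input, and `count_codepoints` is a hypothetical helper, not the function changed by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Count codepoints in `length` bytes of UTF-8 text by counting every
// byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8.
size_t count_codepoints(const char *bytes, size_t length) {
  size_t count = 0;
  for (size_t i = 0; i < length; i++) {
    if (((uint8_t)bytes[i] & 0xC0) != 0x80) count++;
  }
  return count;
}
```

For a line prefix containing one two-byte character, the byte count and the codepoint count differ by one, which is the kind of discrepancy a codepoint-based column avoids.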
Max Brunsfeld
3b7c4e62d2 🎨 subtree.h 2021-12-30 16:33:26 -08:00
Max Brunsfeld
622359b400 Simplify allocation-recording in test suite using new ts_set_allocator API 2021-12-30 16:09:07 -08:00
Max Brunsfeld
e01ea9ff51
Merge pull request #1544 from mkvoya/dynamic-allocator
Allow to change the allocator dynamically
2021-12-28 13:39:11 -08:00
Florian Märkl
d5d99e0bfb Address feedback 2021-12-24 17:07:32 +01:00
Florian Märkl
2024f27534 Make SubtreeInlineData work on Big-Endian 2021-12-24 16:47:10 +01:00
Mingkai Dong
8e4d4ef8b9 Replace allocator struct with function pointers 2021-12-24 09:28:23 +08:00
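As a rough picture of what "function pointers instead of an allocator struct" can look like, the sketch below routes allocations through replaceable global function pointers. The names `my_set_allocator`, `my_alloc`, and `my_release`, and the counting allocator, are hypothetical and only illustrate the pattern; they are not the library's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Current allocation functions; default to the C library.
static void *(*current_malloc)(size_t) = malloc;
static void (*current_free)(void *) = free;

// Swap in custom functions; passing NULL restores the default.
void my_set_allocator(void *(*new_malloc)(size_t),
                      void (*new_free)(void *)) {
  current_malloc = new_malloc ? new_malloc : malloc;
  current_free = new_free ? new_free : free;
}

// All library-internal allocations go through these wrappers.
void *my_alloc(size_t size) { return current_malloc(size); }
void my_release(void *ptr) { current_free(ptr); }

// Example custom allocator: counts live allocations, as a test
// suite might do to detect leaks.
static size_t live_allocations = 0;

static void *counting_malloc(size_t size) {
  live_allocations++;
  return malloc(size);
}

static void counting_free(void *ptr) {
  if (ptr) live_allocations--;
  free(ptr);
}
```

One caveat with this design is that memory must be freed by the allocator that produced it, so switching allocators while allocations are outstanding is unsafe.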
Max Brunsfeld
ddeaa0c7f5
Merge pull request #1483 from furunkel/patch-1
Don't use zero maxlen for snprintf in ts_subtree__write_to_string
2021-12-23 10:52:20 -08:00
Mingkai Dong
486ea2569d Avoid allocator from being switched more than once 2021-12-18 16:45:18 +08:00
Mingkai Dong
578bf74bf3 Add TSAllocator and ts_set_allocator in api.h 2021-12-18 09:53:58 +08:00
Mingkai Dong
b516f96f37 Fix declaration of ts_toggle_allocation_recording 2021-12-18 00:33:49 +08:00
Mingkai Dong
e742186c25 Allow to change the allocator dynamically 2021-12-17 20:16:20 +08:00
Max Brunsfeld
25f64e1eb6 Place tighter limits on the work done during query analysis 2021-12-09 22:18:21 -08:00
Max Brunsfeld
26dac9b2dd Fix query bugs revealed by randomized tests
* Fix bugs related to named wildcard patterns vs regular wildcard patterns.
* Fix handling of extra nodes during query analysis. Previously, the
expected child_index was updated incorrectly after an extra node,
leading to false "impossible pattern" errors.
* Refine logic for avoiding unnecessary state-splitting due to fallible steps.
Compute *two* different analysis results related to step fallibility:
  * `root_pattern_guaranteed` which, like before, summarizes whether the
    entire pattern is guaranteed to match once this step is reached.
  * `parent_pattern_guaranteed` - which just indicates whether the
    immediate parent pattern is guaranteed. This is now used when
    deciding whether it's necessary to split a match state.
2021-11-21 12:02:58 -08:00
Max Brunsfeld
fea3eca312 Improve query execution logging 2021-11-21 11:39:29 -08:00
Max Brunsfeld
142f4b6438 Rename Query::step_is_definite -> is_pattern_guaranteed_at_step 2021-11-21 11:37:52 -08:00
Max Brunsfeld
4e2e059865 Ensure 'extra' bit is set correctly when reusing a node
Fixes #1444
2021-11-19 12:43:55 -08:00
Max Brunsfeld
1fe0420f0f Avoid unnecessary stack entries in query analysis
When descending into a hidden child rule, the current stack entry
can be reused if it is currently at the end of its rule.

This fixes a test failure when analyzing a Ruby query. The
failure was introduced due to some changes to the Ruby grammar.
This optimization allows us to impose a _smaller_ limit on
the stack size, which should make query analysis faster and
more memory-efficient.
2021-11-19 11:04:36 -08:00
furunkel
f78ad7162f
Don't use zero maxlen for snprintf in ts_subtree__write_to_string
It seems that (some implementations of?) `snprintf` returns -1 and sets `errno` to `EINVAL` if a `maxlen` of zero is passed. This causes the count to underflow and `ts_subtree__write_to_string` returns a gigantic size which the succeeding malloc will refuse to allocate.
2021-11-12 20:52:15 +01:00
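The failure mode described here arises in the common two-pass pattern: measure with `snprintf`, allocate, then format. A minimal sketch of the safer variant passes a one-byte scratch buffer for the measuring pass instead of a zero length, and checks for a negative return value before computing the size. `format_pair` is a hypothetical helper, not the function touched by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Format "(name value)" into a newly allocated string.
// Returns NULL on formatting or allocation failure; the caller frees.
char *format_pair(const char *name, int value) {
  // Measuring pass: use a real one-byte buffer rather than maxlen 0,
  // which some snprintf implementations reportedly reject.
  char scratch[1];
  int needed = snprintf(scratch, sizeof scratch, "(%s %d)", name, value);

  // A negative result must not be turned into a size: converting it
  // to an unsigned type would yield an enormous allocation request.
  if (needed < 0) return NULL;

  size_t size = (size_t)needed + 1;  // room for the terminating NUL
  char *result = malloc(size);
  if (!result) return NULL;

  // Writing pass into the correctly sized buffer.
  snprintf(result, size, "(%s %d)", name, value);
  return result;
}
```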
Max Brunsfeld
ddb12dc0c6 query: Return error on unclosed tree pattern in alternation
Fixes #1436
2021-10-12 09:20:43 -07:00
Max Brunsfeld
22a5cfbe10 Assign ids to query matches only when the matches are returned
Refs #1372
2021-09-13 12:39:48 -07:00
Andrew Hlynskyi
52e6c900c3 fix(lib): fix segfault on ts_query_new with incompatible grammar version, close #1318 2021-09-03 14:24:18 +03:00
Andrew Hlynskyi
7f538170bf fix(parser): count rows in the debug log from 0 2021-07-15 11:47:14 +03:00
Cameron Forbis
9182ebef86 update set_included_ranges to modify extent if the current position is at the very beginning of the included range 2021-06-17 16:42:25 -07:00
Max Brunsfeld
ad8bd3c3f5
Merge pull request #1120 from claudi/cast-printed-pointers
Fix: cast pointers to `void *` when printing
2021-06-07 09:09:54 -07:00
Max Brunsfeld
f3ea60e23f Merge branch 'master' into query-cursor-api 2021-06-02 11:51:26 -07:00
Douglas Creager
cc20708a33 query: Minor cleanups 2021-06-02 14:16:04 -04:00
Douglas Creager
47f1af818a query: Remove bits.h 2021-06-02 14:14:57 -04:00
Douglas Creager
1f6eac555c query: Use uint32_t for capture list IDs 2021-06-02 13:19:52 -04:00
Douglas Creager
cd96552448 query: Allow configurable match limit
The default is now a whopping 64K matches, which "should be enough for
everyone".  You can use the new `ts_query_cursor_set_match_limit`
function to set this to a lower limit, such as the previous default of
32.
2021-06-02 11:30:55 -04:00
Max Brunsfeld
851f55afce Report non-rooted matches that intersect cursor's range restriction
Co-Authored-By: Nathan Sobo <nathan@zed.dev>
2021-05-28 11:58:38 -07:00
Max Brunsfeld
919e9745a6 Add ts_tree_cursor_goto_first_child_for_point function
This function (and the similar `ts_tree_cursor_goto_first_child_for_byte`)
allows you to efficiently seek the tree cursor to a given position,
exploiting the tree's internal balancing, without having to visit
all of the preceding siblings of each node.
2021-05-27 12:30:19 -07:00
Max Brunsfeld
fda35894d4 Stop matching new patterns past the end of QueryCursor's range
This restores the original signatures of the `set_byte_range` and
`set_point_range` functions. Now, the QueryCursor will properly report
matches that intersect, but are not fully contained by its range.

Co-Authored-By: Nathan Sobo <nathan@zed.dev>
2021-05-25 18:02:35 -07:00
Max Brunsfeld
f597cc6a75 Preserve matches that contain the QueryCursor's start byte
Co-Authored-By: Nathan Sobo <nathan@zed.dev>
Co-Authored-By: Antonio Scandurra <me@as-cii.com>
2021-05-25 13:06:24 -07:00
Max Brunsfeld
a61f25bc58 Add APIs for advancing a QueryCursor to an arbitrary position 2021-05-24 21:07:59 -07:00
Douglas Creager
78010722a4 query: Allow unlimited pending matches
Well, not completely unlimited — we're still using a 16-bit counter to
keep track of them.  But we no longer have a static maximum of 32 pending
matches when executing a query.
2021-05-24 11:02:58 -04:00
Claudi Lleyda Moltó
8e4509e47b
Fix: cast pointers to void * when printing
To avoid undefined behaviour, pointers should be cast to `void *` when
printed with `%p`.
2021-05-21 12:36:58 +02:00
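The rule this commit applies can be shown in a few lines: `%p` expects an argument of type `void *`, so any other object pointer type is cast explicitly before being passed to a variadic printing function. `format_pointer` below is a hypothetical helper written only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Write the textual form of `ptr` into `buffer`.
// The explicit cast matters: variadic arguments are not converted
// automatically, and %p is only defined for a `void *` argument.
int format_pointer(char *buffer, size_t size, int *ptr) {
  return snprintf(buffer, size, "%p", (void *)ptr);
}
```

The exact text produced by `%p` is implementation-defined, so the output should be treated as diagnostic only.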
Niranjan Hasabnis
bd06b1a8b3 Merge branch 'nhasabni/ts_node_child_field_name' of https://github.com/nhasabni/tree-sitter into nhasabni/ts_node_child_field_name 2021-05-20 23:37:03 +00:00