Commit graph

184 commits

Author SHA1 Message Date
Amaan Qureshi
8f73fb502f
Merge pull request #2408 from amaanq/codeql-bugs
fix(lib): explicitly cast numbers to the same size in potential spots for infinite loops
2023-07-19 16:11:43 -04:00
Amaan Qureshi
753fa1c3ff
fix(lib): explicitly cast numbers to the same size in potential spots for infinite loops 2023-07-19 03:49:14 -04:00
Samuel Moelius
a07cdb59f3
Handle edge cases involving consecutive "zero or" modifiers 2023-07-19 03:27:43 -04:00
Max Brunsfeld
40f7b2ec97 Fix parsing of queries that start with repetitions followed by alternatives 2023-07-18 17:57:52 -07:00
Amaan Qureshi
c16a8c71ce
fix: pass a value_id the same size as predicate_capture_ids's elements to avoid big-endian integer narrowing
This fixes a bug on big-endian architectures where the value is later passed by reference as an element "view" before being inserted. The value is cast to a void pointer, and when a single uint16_t is written, only 2 of its 4 bytes are copied. That is harmless on little-endian systems, but not on big-endian ones.
2023-07-18 05:40:38 -04:00
Max Brunsfeld
356f68293a Fix false positive query match bug, introduced in #2085 2023-07-10 16:12:59 -04:00
Philipp Mildenberger
55a8db10cc fix: bug with first child group anchor (anchor had no effect) 2023-05-13 19:40:49 +03:00
Andrew Hlynskyi
4f4b86a40b lib: make query step init depend on MAX_STEP_CAPTURE_COUNT decl 2023-04-19 09:37:46 +03:00
Andrew Hlynskyi
d4d5e29c91 feat(lib): ts_query_cursor_set_max_start_depth - use 0 to reset 2023-04-17 11:16:04 +03:00
Lewis Russell
1e81a1b67f feat(lib): add ts_query_cursor_set_max_start_depth query API
This allows configuring cursors to avoid traversing too deep into a tree.
2023-04-17 11:15:13 +03:00
Andrew Hlynskyi
4c2a36302b lib: fix OOB in query engine reported in #2162 2023-04-06 03:59:55 +03:00
Matt
65c16bfb17 query casts 2023-04-04 17:43:27 +03:00
Max Brunsfeld
837899e456 Add API for checking if a pattern in a query is non-local 2023-02-16 11:59:34 -08:00
Max Brunsfeld
40703f110c Fix bug in maintenance of query cursor's tree depth 2023-02-16 11:59:34 -08:00
Max Brunsfeld
fa869cf3ed Restructure query_cursor_advance to explicitly control which hidden nodes it descends into 2023-02-16 11:59:34 -08:00
Max Brunsfeld
189cf6d59d Group analysis state sets into QueryAnalysis struct 2023-02-16 11:59:34 -08:00
Max Brunsfeld
32ce1fccd0 Precompute the set of repetition symbols that can match rootless patterns 2023-02-16 11:59:34 -08:00
Matt
8751fa0853
Add explicit casting for array capacities 2022-09-21 15:52:44 -04:00
Max Brunsfeld
6b87326470
Merge pull request #1787 from kianmeng/fix-typos
Fix typos
2022-08-25 10:25:39 -07:00
Sebastian Lackner
2174288e30 query: Use uint16_t for production_id in AnalysisSubgraphNode struct 2022-07-26 21:50:38 +02:00
Max Brunsfeld
79eaa68793 Don't match nested wildcard patterns against error nodes 2022-07-07 18:11:52 -07:00
Max Brunsfeld
548c12fb88 Fix bug where patterns with top-level alternatives were not considered 'rooted' 2022-07-07 17:53:54 -07:00
Max Brunsfeld
1401767689 query: Don't attempt to match top-level sibling patterns directly in ERROR nodes
Co-authored-by: Keith Simmons <keith@zed.dev>
2022-07-07 15:27:00 -07:00
Kian-Meng Ang
b8552ec6c4 Fix typos 2022-06-28 19:57:42 +08:00
Max Brunsfeld
58b719541b Fix failure to match queries with wildcard at root with range restrictions 2022-06-22 15:54:06 -07:00
Max Brunsfeld
fce23d63b3
Merge pull request #1602 from the-mikedavis/md-ignore-future-matches-for-non-local-patterns
prevent future captures for `#is-not? local` matches
2022-01-19 16:40:30 -08:00
Michael Davis
716ef24578
remove unfinished queries from 'ts_query_cursor_remove_match' 2022-01-18 17:01:07 -06:00
Max Brunsfeld
aaf4572727
Merge pull request #1589 from alex-pinkus/deduplicate-core-ids
Ignore duplicate states when initializing subgraphs in `ts_query__analyze_patterns`
2022-01-17 13:54:31 -08:00
Alex Pinkus
eaf9b170f1 Don't start with duplicate states in ts_query__analyze_patterns
This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).

With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
2022-01-16 11:17:47 -08:00
Max Brunsfeld
e96ee19901
Merge pull request #1504 from hendrikvanantwerpen/expose-capture-suffixes
Expose capture suffixes in queries
2022-01-14 12:11:25 -08:00
Hendrik van Antwerpen
9dace8f9fe Add explicit breaks to prevent fall through errors 2022-01-11 19:08:32 +01:00
Hendrik van Antwerpen
c76d8ee076 Represent quantifiers using bytes instead of ints 2022-01-11 18:41:33 +01:00
Hendrik van Antwerpen
70aee901ac Reduce error handling logic 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
99e74fa0f5 Move quantifier addition out of loop and drop conditional 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
93db863729 Remove obsolete FIXMEs 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ec9b00e5c6 Handle multiple top-level alternations correctly 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
8b28f3a8c4 Shorten quantifier operations by using early returns 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
e338726cde Prefix globally visible TSquantifier values 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ae2ac3c0db Initialize variable to silence compiler warnings 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
a1a241b013 Expose quantifiers per pattern, instead of merging for all patterns in a query 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1d513bcf67 Rewrite quantifier operations
- Simplify control flow by having a single return at the end of the function.
- Follow enum order for case order.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1f1a449c76 Improve capture quantifier computation
Compute quantifiers in a bottom-up manner, which allows more precise
results for alternations, where the quantifiers are now precisely joined.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
9bac066330 Deal with quantifiers appearing on capture's enclosing patterns
- Use a proper enum type for quantifiers.
- Drop quantifiers from `TSQueryStep`, which was not used.
- Keep track of the captures introduced during a pattern parse, and
  apply the quantifier for the pattern to the captures that were
  introduced by the pattern or any sub patterns.
- Use 'quantifier' instead of 'suffix'.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ae7869d1a6 Expose capture suffixes in queries 2022-01-11 18:33:36 +01:00
Alex Pinkus
679a841183 Add pointer indirection to AnalysisStateSet
Profiling the `ts_query__analyze_patterns` function shows that it
spends a lot of time copying items in its various state sets. These
state sets are kept sorted, and the items are fairly large, so any time
that we insert new entries near the front of the array, a lot of calls
to memcpy must occur.

In advance of more sophisticated rework, one easy win is to hide the
large `AnalysisStateSet` objects behind pointers, so that the size of
each item in the list goes from 68 to 8 bytes, and add an object pool to
reuse allocations. This shows a significant performance improvement for
grammars that have a lot of states in them.
2022-01-10 20:07:14 -08:00
Max Brunsfeld
25f64e1eb6 Place tighter limits on the work done during query analysis 2021-12-09 22:18:21 -08:00
Max Brunsfeld
26dac9b2dd Fix query bugs revealed by randomized tests
* Fix bugs related to named wildcard patterns vs regular wildcard patterns.
* Fix handling of extra nodes during query analysis. Previously, the
expected child_index was updated incorrectly after an extra node,
leading to false "impossible pattern" errors.
* Refine logic for avoiding unnecessary state-splitting due to fallible steps.
Compute *two* different analysis results related to step fallibility:
  * `root_pattern_guaranteed` which, like before, summarizes whether the
    entire pattern is guaranteed to match once this step is reached.
  * `parent_pattern_guaranteed` - which just indicates whether the
    immediate parent pattern is guaranteed. This is now used when
    deciding whether it's necessary to split a match state.
2021-11-21 12:02:58 -08:00
Max Brunsfeld
fea3eca312 Improve query execution logging 2021-11-21 11:39:29 -08:00
Max Brunsfeld
142f4b6438 Rename Query::step_is_definite -> is_pattern_guaranteed_at_step 2021-11-21 11:37:52 -08:00
Max Brunsfeld
1fe0420f0f Avoid unnecessary stack entries in query analysis
When descending into a hidden child rule, the current stack entry
can be reused if it is currently at the end of its rule.

This fixes a test failure when analyzing a Ruby query. The
failure was introduced due to some changes to the Ruby grammar.
This optimization allows us to impose a _smaller_ limit on
the stack size, which should make query analysis faster and
more memory-efficient.
2021-11-19 11:04:36 -08:00