583 commits

Author SHA1 Message Date
Riley Bruins
6850df969d fix(query): prevent cycles when analyzing hidden children
**Problem:** `query.c` compares the current analysis state with the
previous analysis state to see if they are equal, so that it can return
early if so. This prevents redundant work. However, the comparison
function here differs from the one used for sorted insertion/lookup in
that it does not check any state data other than the child index. This
is problematic because it leads to infinite analysis when hidden nodes
have cycles.

**Solution:** Remove the custom comparison function, and apply the
insertion/lookup comparison function in place of it.

**NOTE:** This commit also changes the comparison function slightly, so
that some comparisons are reordered. Namely, for performance, it returns
early if the lhs depth is less than the rhs depth. Is this acceptable?
Tests still pass and nothing hangs in my testing, but it still seems
sketchy. Returning early if the lhs depth is greater than the rhs depth
does seem to make query analysis hang, weirdly enough... Keeping the
depth checks at the end of the loop also works, but it introduces a
noticeable performance regression (for queries that otherwise wouldn't
have had analysis cycles, of course).
2025-07-30 00:41:01 -04:00
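A minimal sketch of the mismatch described in the commit above, assuming a simplified state layout. All `Demo*` and `demo_*` names are hypothetical and are not taken from `query.c`. It only shows that an equality check restricted to the child index can call two states equal while a full comparison tells them apart. The depth-first ordering in `demo_compare` is an arbitrary choice for the sketch, not the ordering discussed in the NOTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical, simplified analysis-state types; field names are illustrative.
typedef struct {
  uint16_t parse_state;
  uint16_t parent_symbol;
  uint16_t child_index;
} DemoStateEntry;

#define DEMO_MAX_DEPTH 4

typedef struct {
  DemoStateEntry stack[DEMO_MAX_DEPTH];
  uint16_t depth;
} DemoState;

// Weak equality: inspects only the child index of each stack entry, so
// states that differ in any other field still compare equal.
bool demo_weak_equal(const DemoState *a, const DemoState *b) {
  if (a->depth != b->depth) return false;
  for (uint16_t i = 0; i < a->depth; i++) {
    if (a->stack[i].child_index != b->stack[i].child_index) return false;
  }
  return true;
}

// Full three-way comparison over every field, the kind of function one
// would use for sorted insertion and lookup. Returns <0, 0, or >0.
int demo_compare(const DemoState *a, const DemoState *b) {
  assert(a->depth <= DEMO_MAX_DEPTH && b->depth <= DEMO_MAX_DEPTH);
  if (a->depth != b->depth) return a->depth < b->depth ? -1 : 1;
  for (uint16_t i = 0; i < a->depth; i++) {
    const DemoStateEntry *x = &a->stack[i];
    const DemoStateEntry *y = &b->stack[i];
    if (x->parse_state != y->parse_state)
      return x->parse_state < y->parse_state ? -1 : 1;
    if (x->parent_symbol != y->parent_symbol)
      return x->parent_symbol < y->parent_symbol ? -1 : 1;
    if (x->child_index != y->child_index)
      return x->child_index < y->child_index ? -1 : 1;
  }
  return 0;
}
```

Using `demo_compare(a, b) == 0` as the only notion of equality keeps the early-return check and the sorted set in agreement, which is the direction the commit takes.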
Alex Aron
aeab755033
fix(lib): add wasm32 support to portable/endian.h (#4607) 2025-07-14 17:47:40 +02:00
Riley Bruins
6cabd9e67f fix(query)!: assert that predicates end in ! or ?
Predicates/directives are documented to end in either `!` or `?`.
However, `query.c` allows them to be any valid identifier, and also
allows `?` or `!` characters anywhere inside an identifier.

This commit removes `?` and `!` from the list of valid identifier
characters, and asserts that predicates/directives only *end* in `?` or
`!`, respectively.

This commit is breaking because you can no longer do something like
`(#eq? @capture foo!bar)` (`foo!bar` must now be quoted).
2025-06-06 10:34:00 +02:00
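The rule from the commit above can be sketched as a standalone check. This is not the code in `query.c`: the helper names are hypothetical, and the set of identifier characters (alphanumerics, `_`, `-`, `.`) is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: characters allowed inside a bare identifier once
// `?` and `!` are no longer part of the set (the exact set is assumed).
static bool demo_is_identifier_char(unsigned char c) {
  return isalnum(c) || c == '_' || c == '-' || c == '.';
}

// Returns true if `name` (the text after `#`) is a non-empty identifier
// followed by exactly one trailing `?` or `!`.
bool demo_is_valid_predicate_name(const char *name) {
  assert(name != NULL);
  size_t len = strlen(name);
  if (len < 2) return false;  // need at least one character plus the suffix
  char last = name[len - 1];
  if (last != '?' && last != '!') return false;
  // Every character before the suffix must be a plain identifier character.
  for (size_t i = 0; i + 1 < len; i++) {
    if (!demo_is_identifier_char((unsigned char)name[i])) return false;
  }
  return true;
}
```

Under these assumptions `eq?` and `set!` pass, while `eq` and `foo!bar` are rejected. This mirrors why `foo!bar` must now be quoted when used as an argument.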
Will Lillis
8bd923ab9e fix(lib): replace raw array accesses with array_get 2025-06-05 00:53:11 -04:00
Max Brunsfeld
2ab9c9b590
Fully fix field underflow in go_to_previous_sibling (#4483)
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
2025-06-02 15:34:25 -07:00
Max Brunsfeld
f91255a201
Fix crash w/ goto_previous_sibling when parent node has leading extra child (#4472)
* Fix crash w/ goto_previous_sibling when parent node has leading extra child

* Fix lint

Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
2025-05-27 16:56:33 -07:00
Haoxiang Fei
06537fda83 fix: wasi has endian.h 2025-05-24 12:27:13 +02:00
Mike Zeller
4339b0fe05 illumos has endian.h 2025-05-15 09:53:45 +02:00
Will Lillis
31b9717ca3 fix(lib): return early for empty predicate step slice 2025-05-11 08:57:30 -04:00
Will Lillis
b1d2b7cfb8 fix(query): correct last_child_step_index in cases where a new step wasn't created.

This fixes an OOB access to `self.steps` when a last child anchor
immediately follows a predicate.
2025-05-03 17:27:37 -04:00
Amaan Qureshi
21c658a12c fix(lib): do not access the alias sequence for the end subtree in ts_subtree_summarize_children 2025-04-28 23:13:13 -04:00
Riley Bruins
733d7513af fix(lib): reset parser options after use
**Problem:** After `ts_parser_parse_with_options()`, the parser options
are still stored in the parser object, meaning that a successive call to
`ts_parser_parse()` will actually behave like
`ts_parser_parse_with_options()`, which is not obvious and can have
unintended consequences.

**Solution:** Reset to empty options state after
`ts_parser_parse_with_options()`.
2025-04-14 21:35:40 -04:00
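A toy model of the problem and fix described in the commit above. The `Demo*` types and `demo_*` functions are hypothetical stand-ins; the real parser and options types are not reproduced here. The sketch only shows the pattern of clearing stored options after the call that set them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-ins for the parser and its options.
typedef struct {
  void *payload;
  bool (*progress_callback)(void *payload);
} DemoParseOptions;

typedef struct {
  DemoParseOptions options;
  unsigned callback_calls;  // how often the callback fired, for illustration
} DemoParser;

// Shared parse routine: consults whatever options are currently stored.
static void demo_run(DemoParser *self) {
  if (self->options.progress_callback) {
    self->callback_calls++;
    self->options.progress_callback(self->options.payload);
  }
}

// Plain parse: should behave as if no options were ever supplied.
void demo_parse(DemoParser *self) {
  assert(self != NULL);
  demo_run(self);
}

void demo_parse_with_options(DemoParser *self, DemoParseOptions options) {
  assert(self != NULL);
  self->options = options;
  demo_run(self);
  // The fix: reset to the empty options state so the callback cannot leak
  // into a later plain demo_parse() call.
  self->options = (DemoParseOptions){0};
}

// Example callback that never asks to stop.
bool demo_never_cancel(void *payload) {
  (void)payload;
  return false;
}
```

Without the reset at the end of `demo_parse_with_options`, a following `demo_parse` would still invoke the stored callback. That is the non-obvious behaviour the commit removes.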
NOT XVilka
a00fab7dc4
fix(lib): remove duplicate TSLanguageMetadata typedef (#4268) 2025-03-06 14:14:25 -08:00
Max Brunsfeld
066fd77d39
Fix cases where error recovery could infinite loop (#4257)
* Rename corpus test functions to allow easy filtering by language

* Use usize for seed argument

* Avoid retaining useless stack versions when reductions merge

We found this problem when debugging an infinite loop that happened
during error recovery when using the Zig grammar. The large number of
unnecessary paused stack versions were preventing the correct recovery
strategy from being tried.

* Fix leaked lookahead token when reduction results in a merged stack

* Enable running PHP tests in CI

* Fix possible infinite loop during error recovery at EOF

* Account for external scanner state changes when detecting changed ranges in subtrees
2025-03-04 13:50:56 -08:00
Max Brunsfeld
2bd400dcee
Reset result_symbol field of lexer in wasm memory in between invocations (#4218) 2025-02-17 17:36:46 -08:00
Max Brunsfeld
dedcc5255a
Ignore external tokens that are zero-length and extra (#4213)
Co-authored-by: Anthony <anthony@zed.dev>
2025-02-17 15:07:44 -08:00
Max Brunsfeld
14b8ead412
Fix crash when loading languages w/ old ABI via wasm (#4210) 2025-02-17 13:56:53 -08:00
Thomas Klausner
14647b2a38 build: add a comment explaining why we undef _POSIX_C_SOURCE 2025-02-02 17:14:28 -05:00
Thomas Klausner
5311904619 build: fix compilation on NetBSD a different way 2025-02-02 17:14:28 -05:00
Riley Bruins
9ad096ef22 fix(lib): prevent finished_tree assertion failure
**Problem:** When resetting the parser during subtree balancing, an
error is thrown:

```
parser.c:2198: ts_parser_parse: Assertion `self->finished_tree.ptr' failed.
```

**Solution:** Reset `canceled_balancing` to false in
`ts_parser_reset()`.
2025-02-01 16:19:14 -05:00
Christian Clason
36f5f7918f fix(endian): rely on system headers where possible
Problem: Macros (re)defined in `endian.h` conflict with system headers
on FreeBSD (at least).

Solution: Rely on system `endian.h` on OpenBSD, FreeBSD, NetBSD, and
DragonFly

Ref. https://github.com/mikepb/endian.h/issues/4
2025-01-25 13:28:46 -05:00
Allan Clements
cda634a1c4 feat: add error information in the progress callback
This allows users to bail parsing if an error was *definitely* detected
using the progress callback, as all possible stack versions have a
non-zero error cost.

Co-authored-by: Amaan Qureshi <amaanq12@gmail.com>
2025-01-25 02:47:39 -05:00
Amaan Qureshi
8bb1448a6f feat: add the semantic version to TSLanguage, and expose an API for retrieving it 2025-01-25 01:14:30 -05:00
Amaan Qureshi
6e88672dac chore: cleanup unused code 2025-01-21 01:17:03 -05:00
Amaan Qureshi
c8353a52af fix(lib): don't always clear the tree stack
Only do so if the parser is not resuming balancing
2025-01-21 00:31:34 -05:00
Amaan Qureshi
9365586cc3 feat: allow parser balancing to be cancellable 2025-01-20 23:52:19 -05:00
Amaan Qureshi
344a88c4fb feat(lib)!: remove ts_node_child_containing_descendant
It was marked deprecated in 0.24
2025-01-12 22:11:30 -05:00
Amaan Qureshi
5de314833f feat(query): structurally verify supertype queries 2025-01-12 13:04:10 -05:00
Amaan Qureshi
7953aba070 fix(lib): use inclusive range check for non-empty nodes in next sibling computation 2025-01-10 22:00:33 -05:00
Amaan Qureshi
0195bbf1b4 fix(lib): avoid OOB access when updating alternative steps 2025-01-10 19:41:43 -05:00
Lucas Marçal
aea3a4720a fix(endian): support POSIX mode on Apple platforms 2025-01-06 01:13:04 -05:00
Lucas Marçal
28d5272e71 build(swift): include all source files 2025-01-06 01:13:04 -05:00
Riley Bruins
19482834bd feat: add Supertype API
Introduces a new function that takes in a supertype symbol and returns
all associated subtypes. Can be used by query.c to give better errors
for invalid subtypes, as well as downstream applications like the query
LSP to give better diagnostics.
2025-01-05 00:14:09 -05:00
Amaan Qureshi
efc51a596c fix(lib): don't consider unfinished captures definite when the following step is immediate
When collecting captures, we were treating unfinished ones as definite
even if they had pending immediate steps that weren't yet satisfied. Now
we only mark a capture as definite if the pattern is guaranteed and
there are no pending immediate steps to check.
2025-01-04 02:03:41 -05:00
Amaan Qureshi
5f379da544 fix(lib): prevent wildcards from incorrectly marking child patterns as infallible
When a pattern appears under a wildcard parent (like "(_ (expr))"), we
were incorrectly marking it as infallible. The parent_pattern_guaranteed
flag only means the pattern will match after finding the right wildcard
parent, not that any wildcard parent will work.
2025-01-03 23:09:49 -05:00
Amaan Qureshi
a7e6d01144 fix(lib): propagate last_child status to pattern alternatives in queries
Previously, when a pattern was marked as the last child in a query, its
alternatives weren't marked similarly, causing incorrect matching
behavior. Now, the `last_child` status is properly propagated through
all alternatives.
2025-01-03 21:13:29 -05:00
Amaan Qureshi
22f67e2b67 fix(query): ensure immediate matches for any node when an anchor follows a wildcard node 2024-12-29 00:54:16 -05:00
Amaan Qureshi
694d636322 fix(lib): correct fix for parsing hang with ranges containing empty points
It's more correct to check the bytes of the `size` length, rather than
use the point as a condition for resetting the lexer's token start
position
2024-12-25 04:49:39 -05:00
Amaan Qureshi
f3d50f273b fix(lib): add saturating subtraction to prevent integer underflow 2024-12-25 04:49:39 -05:00
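For reference, saturating subtraction on unsigned integers can be sketched as follows. The helper name is hypothetical and the snippet illustrates the general technique only, not the code added by the commit above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: subtracts b from a, clamping at zero instead of
// wrapping around when b is larger than a.
static inline uint32_t demo_saturating_sub_u32(uint32_t a, uint32_t b) {
  return a > b ? a - b : 0;
}

// Sanity checks contrasting plain unsigned subtraction with the clamp.
void demo_saturating_sub_check(void) {
  uint32_t wrapped = (uint32_t)3 - (uint32_t)5;  // wraps modulo 2^32
  assert(wrapped == UINT32_MAX - 1);
  assert(demo_saturating_sub_u32(3, 5) == 0);
  assert(demo_saturating_sub_u32(5, 3) == 2);
}
```

The clamp matters wherever the result is later used as an offset or length, since a wrapped value would be read as a very large number.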
Max Brunsfeld
201b41cf11
feat: add 'reserved word' construct
Co-authored-by: Amaan Qureshi <amaanq12@gmail.com>
2024-12-23 03:06:32 -05:00
Will Lillis
2a63077cac
style: correct typos 2024-12-23 02:11:09 -05:00
Amaan Qureshi
8744a4e3f2 feat(lib): use const for TSCharacterRanges 2024-12-23 01:19:10 -05:00
Riley Bruins
495fe2a6c5
feat: support querying missing nodes
Co-authored-by: Amaan Qureshi <amaanq12@gmail.com>
2024-12-14 14:57:36 -05:00
Amaan Qureshi
69d977d736 fix(lib): use clock_gettime on macOS again 2024-12-03 18:12:32 -05:00
Will Lillis
5d1be545c4
fix(lib): correct next sibling of zero width node 2024-11-12 18:17:45 -05:00
WillLillis
8c802da174 fix(lib): check point, byte ranges in node_descendant_for functions
2024-11-02 03:06:07 -04:00
WillLillis
5b5cf5a5e5 fix(lib): check point, byte ranges in ts_query_cursor_set range functions
2024-11-02 03:06:07 -04:00
Amaan Qureshi
500f4326d5 feat: add the ability to specify a custom decode function 2024-10-31 22:51:40 -04:00
Amaan Qureshi
8d68980aa8 feat(lib): add ts_query_cursor_exec_with_options
Currently, this allows users to pass in a callback that should be
invoked to check whether or not to halt query execution
2024-10-31 21:58:35 -04:00
Amaan Qureshi
26b89da9bb feat(lib): add ts_parser_parse_with_options
Currently, this allows users to pass in a callback that should be
invoked to check whether or not to halt parsing
2024-10-31 21:58:35 -04:00
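The halt-callback pattern that the two `*_with_options` commits above describe can be sketched with a self-contained toy loop. Every `Demo*` type and `demo_*` function here is hypothetical and only loosely modelled on the commit descriptions; none of it is copied from `api.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical state handed to the callback on each poll.
typedef struct {
  void *payload;
  uint32_t current_byte_offset;
} DemoProgressState;

// The callback returns true to request that work stops.
typedef bool (*DemoProgressCallback)(DemoProgressState *state);

// Walks `length` bytes of input, polling the callback before each byte.
// Returns the number of bytes processed; a value smaller than `length`
// means the callback asked to halt.
uint32_t demo_process(uint32_t length, DemoProgressCallback callback,
                      void *payload) {
  DemoProgressState state = {payload, 0};
  for (uint32_t offset = 0; offset < length; offset++) {
    state.current_byte_offset = offset;
    if (callback && callback(&state)) return offset;
  }
  return length;
}

// Example callback: halt once the offset reaches the budget in `payload`.
bool demo_stop_at_budget(DemoProgressState *state) {
  assert(state != NULL && state->payload != NULL);
  const uint32_t *budget = state->payload;
  return state->current_byte_offset >= *budget;
}
```

Passing the budget through `payload` keeps the callback free of global state, which is the usual reason such options structs carry an opaque pointer.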