Predicates/directives are documented to end in either `!` or `?`.
However, `query.c` allows them to be any valid identifier, and also
allows `?` or `!` characters anywhere inside an identifier.
This commit removes `?` and `!` from the list of valid identifier
characters, and asserts that predicates/directives only *end* in `?` or
`!`, respectively.
This commit is breaking because you can no longer do something like
`(#eq? @capture foo!bar)` (`foo!bar` must now be quoted).
**Problem:** After `ts_parser_parse_with_options()`, the parser options
are still stored in the parser object, meaning that a successive call to
`ts_parser_parse()` will actually behave like
`ts_parser_parse_with_options()`, which is not obvious and can have
unintended consequences.
**Solution:** Reset to empty options state after
`ts_parser_parse_with_options()`.
* Rename corpus test functions to allow easy filtering by language
* Use usize for seed argument
* Avoid retaining useless stack versions when reductions merge
We found this problem when debugging an infinite loop that happened
during error recovery when using the Zig grammar. The large number of
unnecessary paused stack versions were preventing the correct recovery
strategy from being tried.
* Fix leaked lookahead token when reduction results in a merged stack
* Enable running PHP tests in CI
* Fix possible infinite loop during error recovery at EOF
* Account for external scanner state changes when detecting changed ranges in subtrees
**Problem:** When resetting the parser during subtree balancing, an
error is thrown:
```
parser.c:2198: ts_parser_parse: Assertion `self->finished_tree.ptr' failed.
```
**Solution:** Reset `canceled_balancing` to false in
`ts_parser_reset()`.
Problem: Macros (re)defined in `endian.h` conflict with system headers
on FreeBSD (at least).
Solution: Rely on system `endian.h` on OpenBSD, FreeBSD, NetBSD, and
DragonFly
Ref. https://github.com/mikepb/endian.h/issues/4
This allows users to bail parsing if an error was *definitely* detected
using the progress callback, as all possible stack versions have a
non-zero error cost.
Co-authored-by: Amaan Qureshi <amaanq12@gmail.com>
Introduces a new function that takes in a supertype symbol and returns
all associated subtypes. Can be used by query.c to give better errors
for invalid subtypes, as well as downstream applications like the query
LSP to give better diagnostics.
When collecting captures, we were treating unfinished ones as definite
even if they had pending immediate steps that weren't yet satisfied. Now
we only mark a capture as definite if the pattern is guaranteed and
there are no pending immediate steps to check.
When a pattern appears under a wildcard parent (like "(_ (expr))"), we
were incorrectly marking it as infallible. The parent_pattern_guaranteed
flag only means the pattern will match after finding the right wildcard
parent, not that any wildcard parent will work.
Previously, when a pattern was marked as the last child in a query, its
alternatives weren't marked similarly, causing incorrect matching
behavior. Now, the `last_child` status is properly propagated through
all alternatives.