Commit graph

600 commits

Author SHA1 Message Date
Max Brunsfeld
01df16ca9f lib: 0.20.8 2022-06-27 15:07:40 -07:00
Max Brunsfeld
3ac36b0cbe Handle backslashes in token names when printing DOT debug graphs 2022-06-25 17:13:11 -07:00
Max Brunsfeld
5aa2f4dc8c Log when ignoring an empty external token after an error 2022-06-24 19:07:27 -07:00
Max Brunsfeld
d223a81b50 Allow empty external tokens during error recovery if they change the scanner's state 2022-06-24 15:58:13 -07:00
Max Brunsfeld
c0e1991f6b 🎨 ts_parser__lex 2022-06-24 14:24:21 -07:00
Max Brunsfeld
ca902065cb Fix bug when stack versions merge after reducing a non-terminal extra 2022-06-24 14:24:21 -07:00
Max Brunsfeld
db91399ea7 lib: 0.20.7 2022-06-22 16:03:21 -07:00
Max Brunsfeld
58b719541b Fix failure to match queries with wildcard at root with range restrictions 2022-06-22 15:54:06 -07:00
rhysd
08899428f3 Add C APIs as document aliases 2022-05-30 21:36:11 +09:00
Max Brunsfeld
465ceead0f
Merge pull request #1677 from siegel/master
Fixed warning/error when compiling with `clang -Os`.
2022-04-03 15:30:06 -07:00
Aleksei Bavshin
fe33599f46
lib: fix incorrect int ptr cast on big-endian architectures
A `*usize` -> `*u32` conversion on a 64-bit big-endian machine takes the high
halfword of the value. As a consequence, any result returned via
`count` is unexpectedly shifted left:

    u32   = 00 00 00 01             // 1
    usize = 00 00 00 01 00 00 00 00 // 4294967296

Fixes the following test failure:
```
$ cargo test -- tests::corpus_test
<...>
running 13 tests
memory allocation of 206158430208 bytes failed
error: test failed, to rerun pass '--lib'
```
2022-03-23 00:47:01 -07:00
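Note on fe33599f46: the shift described above can be reproduced on any host by spelling out the big-endian byte layout by hand. The following is a minimal sketch in C, not the code from the commit; every function name in it is hypothetical. It models a callee that writes a 32-bit count into the first four bytes of whatever pointer it receives, and compares a caller that hands over an 8-byte slot with one that receives into a 4-byte slot and widens afterwards.

```c
#include <stdint.h>

/* Lay out a 64-bit value as a big-endian machine would store it. */
static void store_be64(uint8_t out[8], uint64_t value) {
  for (int i = 0; i < 8; i++) {
    out[i] = (uint8_t)(value >> (56 - 8 * i));
  }
}

/* Read eight big-endian bytes back into a 64-bit value. */
static uint64_t load_be64(const uint8_t in[8]) {
  uint64_t value = 0;
  for (int i = 0; i < 8; i++) {
    value = (value << 8) | in[i];
  }
  return value;
}

/* Hypothetical callee: writes a 32-bit count, touching only four bytes. */
static void write_count_be32(uint8_t *out, uint32_t count) {
  out[0] = (uint8_t)(count >> 24);
  out[1] = (uint8_t)(count >> 16);
  out[2] = (uint8_t)(count >> 8);
  out[3] = (uint8_t)count;
}

/* Mismatched cast: the caller's 8-byte slot is passed as if it were 4 bytes,
 * so the count lands in the high half of the 64-bit value. */
static uint64_t count_via_mismatched_cast(uint32_t count) {
  uint8_t slot[8];
  store_be64(slot, 0);
  write_count_be32(slot, count);
  return load_be64(slot);
}

/* Matching widths: receive into a 4-byte slot, then widen explicitly. */
static uint64_t count_via_widening(uint32_t count) {
  uint8_t slot[4];
  write_count_be32(slot, count);
  uint32_t narrow = ((uint32_t)slot[0] << 24) | ((uint32_t)slot[1] << 16) |
                    ((uint32_t)slot[2] << 8) | (uint32_t)slot[3];
  return (uint64_t)narrow;
}
```

With this model, a count of 1 comes back as 4294967296 through the mismatched path and as 1 through the widening path. A count of 48 comes back as 206158430208, which is the allocation size shown in the test failure.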
Rich Siegel
150eb2966b Fixed warning/error when compiling with clang -Os.
DISCUSSION:

When compiling with `-Os` for "smallest, fastest", an error is reported in `parser.c`:

```
/Users/siegel/git/tree-sitter/lib/src/./parser.c:1368:10: error: unused variable 'did_merge' [-Werror,-Wunused-variable]
    bool did_merge = ts_stack_merge(self->stack, version, previous_version_count);
         ^
1 error generated.
```

This is because with `NDEBUG` set, `assert(e)` collapses to `(void)0`,
which in turn means that `did_merge` does not actually get consumed.
This seems to get caught when compiling with `-Os`, but not otherwise.

Compiler version:
```
Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: arm64-apple-darwin21.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
```
2022-03-04 18:00:16 -05:00
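Note on 150eb2966b: the warning arises whenever a variable's only use is inside an `assert`. The sketch below shows one common way to keep such a variable "used" in both debug and `NDEBUG` builds. It is an illustration only and may differ from the change made in the commit; `merge_versions` is a hypothetical stand-in for `ts_stack_merge`.

```c
#include <assert.h>
#include <stdbool.h>

static int merge_calls = 0;

/* Hypothetical stand-in for a call whose side effects must always run. */
static bool merge_versions(void) {
  merge_calls++;
  return true;
}

/* The call is made unconditionally. Only the check is debug-only. */
static int merge_and_check(void) {
  bool did_merge = merge_versions();
  (void)did_merge;   /* still a use when assert() expands to ((void)0) */
  assert(did_merge); /* active only when NDEBUG is not defined */
  return merge_calls;
}
```

Compiling this with `-DNDEBUG -Werror -Wunused-variable` should succeed, while dropping the `(void)did_merge;` line would be expected to bring back the diagnostic on compilers that report it. Placing the call itself inside `assert(...)` would be a different bug, since the merge would then be skipped entirely in release builds.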
Max Brunsfeld
1b2e90f647 libs: 0.20.6 2022-03-02 20:50:29 -08:00
Max Brunsfeld
8decec3774 Properly incorporate lookahead bytes when recovering via missing token 2022-03-02 17:12:25 -08:00
Max Brunsfeld
fcbef45899 libs: 0.20.5 2022-03-02 14:43:16 -08:00
Max Brunsfeld
0fb864c1a0 Retain information about the lexer's lookahead for the token where an error was detected 2022-02-22 09:45:26 -08:00
Max Brunsfeld
0bdd9b640c Store the lookahead subtree of paused stack versions, not just the lookahead symbol 2022-02-22 09:45:26 -08:00
Max Brunsfeld
af00782dfd Add files needed for using clangd 2022-02-22 09:44:50 -08:00
Max Brunsfeld
d08f1af15c 🎨 2022-02-22 09:43:57 -08:00
Max Brunsfeld
5ef4ef4e2e lib: 0.20.4 2022-02-04 13:13:21 -08:00
Max Brunsfeld
cb4317ba8e Change goto_first_child_for_{byte,point} to compare nodes' ranges inclusively
Co-Authored-By: Antonio Scandurra <me@as-cii.com>
2022-02-04 12:38:33 -08:00
Max Brunsfeld
fab8540508 libs: 0.20.3 2022-01-21 16:35:37 -08:00
Max Brunsfeld
584b55df8d Delete unused code, tweak whitespace 2022-01-19 16:54:57 -08:00
Max Brunsfeld
fce23d63b3
Merge pull request #1602 from the-mikedavis/md-ignore-future-matches-for-non-local-patterns
prevent future captures for `#is-not? local` matches
2022-01-19 16:40:30 -08:00
Michael Davis
716ef24578
remove unfinished queries from 'ts_query_cursor_remove_match' 2022-01-18 17:01:07 -06:00
Alex Pinkus
858ea5782b Fix back compat by moving primary_state_ids to the end
Due to an oversight in #1589, I added `primary_state_ids` into the
`TSLanguage` struct in a place that wasn't the end. This is not actually
backwards compatible and causes downstream failures :(
2022-01-17 17:23:02 -08:00
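Note on 858ea5782b: field position matters because code compiled against the old struct reads each field at a fixed byte offset. The sketch below uses three hypothetical structs (none of them is the real `TSLanguage` layout, and `added_ids` is an invented field name) to show that appending a field preserves the offsets of existing fields, while inserting one in the middle shifts them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout that already-compiled clients were built against. */
typedef struct {
  uint32_t version;
  uint32_t symbol_count;
  const char *const *symbol_names;
} LanguageOld;

/* New field inserted in the middle: later fields move. */
typedef struct {
  uint32_t version;
  const uint16_t *added_ids;
  uint32_t symbol_count;
  const char *const *symbol_names;
} LanguageInserted;

/* New field appended at the end: existing fields stay where they were. */
typedef struct {
  uint32_t version;
  uint32_t symbol_count;
  const char *const *symbol_names;
  const uint16_t *added_ids;
} LanguageAppended;

/* True when old clients would still find the existing fields. */
static bool appended_layout_is_compatible(void) {
  return offsetof(LanguageOld, symbol_count) == offsetof(LanguageAppended, symbol_count) &&
         offsetof(LanguageOld, symbol_names) == offsetof(LanguageAppended, symbol_names);
}

static bool inserted_layout_is_compatible(void) {
  return offsetof(LanguageOld, symbol_count) == offsetof(LanguageInserted, symbol_count) &&
         offsetof(LanguageOld, symbol_names) == offsetof(LanguageInserted, symbol_names);
}
```

On common 32-bit and 64-bit targets, `appended_layout_is_compatible()` returns true and `inserted_layout_is_compatible()` returns false, so an old reader of the inserted layout would interpret pointer bytes as `symbol_count`.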
Max Brunsfeld
516fd6f6de Add --abi flag to generate command, generate version 13 by default 2022-01-17 14:50:47 -08:00
Max Brunsfeld
aaf4572727
Merge pull request #1589 from alex-pinkus/deduplicate-core-ids
Ignore duplicate states when initializing subgraphs in `ts_query__analyze_patterns`
2022-01-17 13:54:31 -08:00
Alex Pinkus
eaf9b170f1 Don't start with duplicate states in ts_query__analyze_patterns
This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).

With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
2022-01-16 11:17:47 -08:00
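Note on eaf9b170f1: "tracking the first encountered state with a given `core_id`" can be sketched as a single pass over the states. The code below is an illustration under assumptions, not the generator's implementation: it assumes core ids are available as a flat array indexed by state and are smaller than a hypothetical `MAX_CORES` bound.

```c
#include <stdint.h>

enum { MAX_CORES = 64 };   /* hypothetical bound on distinct core ids */
#define NO_STATE UINT16_MAX

/* For each state, record the first state that shares its core id. */
static void compute_primary_state_ids(const uint16_t *core_ids,
                                      uint16_t state_count,
                                      uint16_t *primary_state_ids) {
  uint16_t first_state_for_core[MAX_CORES];
  for (uint16_t core = 0; core < MAX_CORES; core++) {
    first_state_for_core[core] = NO_STATE;
  }
  for (uint16_t state = 0; state < state_count; state++) {
    uint16_t core = core_ids[state]; /* assumed to be < MAX_CORES */
    if (first_state_for_core[core] == NO_STATE) {
      first_state_for_core[core] = state;
    }
    primary_state_ids[state] = first_state_for_core[core];
  }
}

/* A state is primary when it is its own primary state. */
static uint16_t count_primary_states(const uint16_t *primary_state_ids,
                                     uint16_t state_count) {
  uint16_t count = 0;
  for (uint16_t state = 0; state < state_count; state++) {
    if (primary_state_ids[state] == state) count++;
  }
  return count;
}

/* Usage example: six states sharing three cores. */
static uint16_t example_primary_count(void) {
  const uint16_t core_ids[] = {0, 1, 0, 2, 1, 0};
  uint16_t primary[6];
  compute_primary_state_ids(core_ids, 6, primary);
  return count_primary_states(primary, 6);
}
```

In the usage example the primary ids come out as `{0, 1, 0, 3, 1, 0}`, so only three of the six states would need to seed a subgraph.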
Max Brunsfeld
e96ee19901
Merge pull request #1504 from hendrikvanantwerpen/expose-capture-suffixes
Expose capture suffixes in queries
2022-01-14 12:11:25 -08:00
Max Brunsfeld
9df064c9fe
Merge pull request #1581 from tlaplus-community/get-codepoint-column
`get_column` now counts codepoints instead of bytes
2022-01-12 10:49:44 -08:00
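Note on the merge of #1581: the difference between a byte column and a codepoint column only shows up on lines with multi-byte characters. The sketch below assumes valid UTF-8 input and uses hypothetical function names; it is not the lexer's code. It counts codepoints by skipping UTF-8 continuation bytes, which have the bit pattern `10xxxxxx`.

```c
#include <stddef.h>
#include <stdint.h>

/* Column measured in bytes: simply the offset from the start of the line. */
static uint32_t column_in_bytes(size_t byte_offset) {
  return (uint32_t)byte_offset;
}

/* Column measured in codepoints: count every byte that starts a codepoint,
 * i.e. every byte that is not a UTF-8 continuation byte. */
static uint32_t column_in_codepoints(const char *line, size_t byte_offset) {
  uint32_t column = 0;
  for (size_t i = 0; i < byte_offset; i++) {
    if (((unsigned char)line[i] & 0xC0) != 0x80) {
      column++;
    }
  }
  return column;
}
```

For the line `"a\xC3\xA9"` (the letter `a` followed by `é` encoded in two bytes), the position after both characters is column 3 in bytes and column 2 in codepoints.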
Hendrik van Antwerpen
9dace8f9fe Add explicit breaks to prevent fall through errors 2022-01-11 19:08:32 +01:00
Hendrik van Antwerpen
c76d8ee076 Represent quantifiers using bytes instead of ints 2022-01-11 18:41:33 +01:00
Hendrik van Antwerpen
70aee901ac Reduce error handling logic 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
acd3d32c36 Remove reference to strings from quantifier-only function 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
99e74fa0f5 Move quantifier addition out of loop and drop conditional 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
93db863729 Remove obsolete FIXMEs 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ec9b00e5c6 Handle multiple top-level alternations correctly 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
8b28f3a8c4 Shorten quantifier operations by using early returns 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
e338726cde Prefix globally visible TSQuantifier values 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
36f2440369 Complete comment 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ae2ac3c0db Initialize variable to silence compiler warnings 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
a1a241b013 Expose quantifiers per pattern, instead of merging for all patterns in a query 2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1d513bcf67 Rewrite quantifier operations
- Simplify control flow by having a single return at the end of the function.
- Follow enum order for case order.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
1f1a449c76 Improve capture quantifier computation
Compute quantifiers in a bottom-up manner, which allows more precise
results for alternations, where the quantifiers are now precisely joined.
2022-01-11 18:33:36 +01:00
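Note on 1f1a449c76: joining quantifiers across the branches of an alternation can be sketched as follows. This is a hedged illustration with hypothetical names, not the library's implementation. It assumes five quantifier values (zero, zero-or-one, zero-or-more, one, one-or-more) and treats each as three facts: whether the capture may be absent, may be present, and may repeat. The join of two branches is the quantifier that permits whatever either branch permits.

```c
#include <stdbool.h>

typedef enum {
  Q_ZERO,
  Q_ZERO_OR_ONE,
  Q_ZERO_OR_MORE,
  Q_ONE,
  Q_ONE_OR_MORE
} Quantifier;

static bool may_be_absent(Quantifier q) {
  return q == Q_ZERO || q == Q_ZERO_OR_ONE || q == Q_ZERO_OR_MORE;
}

static bool may_be_present(Quantifier q) {
  return q != Q_ZERO;
}

static bool may_repeat(Quantifier q) {
  return q == Q_ZERO_OR_MORE || q == Q_ONE_OR_MORE;
}

/* Rebuild a quantifier from the three facts. */
static Quantifier from_facts(bool absent, bool present, bool repeat) {
  if (!present) return Q_ZERO;
  if (repeat) return absent ? Q_ZERO_OR_MORE : Q_ONE_OR_MORE;
  return absent ? Q_ZERO_OR_ONE : Q_ONE;
}

/* Quantifier of a capture across two alternation branches. */
static Quantifier quantifier_join(Quantifier a, Quantifier b) {
  return from_facts(may_be_absent(a) || may_be_absent(b),
                    may_be_present(a) || may_be_present(b),
                    may_repeat(a) || may_repeat(b));
}
```

For example, a capture that appears exactly once in one branch and not at all in the other joins to zero-or-one, rather than being widened to zero-or-more.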
Hendrik van Antwerpen
9bac066330 Deal with quantifiers appearing on capture's enclosing patterns
- Use a proper enum type for quantifiers.
- Drop quantifiers from `TSQueryStep`, which was not used.
- Keep track of the captures introduced during a pattern parse, and
  apply the quantifier for the pattern to the captures that were
  introduced by the pattern or any sub patterns.
- Use 'quantifier' instead of 'suffix'.
2022-01-11 18:33:36 +01:00
Hendrik van Antwerpen
ae7869d1a6 Expose capture suffixes in queries 2022-01-11 18:33:36 +01:00
Andrew Helwer
5a6530a413 Added tests 2022-01-11 12:05:37 -05:00
Alex Pinkus
679a841183 Add pointer indirection to AnalysisStateSet
Profiling the `ts_query__analyze_patterns` function shows that it
spends a lot of time copying items in its various state sets. These
state sets are kept sorted, and the items are fairly large, so any time
that we insert new entries near the front of the array, a lot of calls
to memcpy must occur.

In advance of more sophisticated rework, one easy win is to hide the
large `AnalysisStateSet` objects behind pointers, so that the size of
each item in the list goes from 68 to 8 bytes, and add an object pool to
reuse allocations. This shows a significant performance improvement for
grammars that have a lot of states in them.
2022-01-10 20:07:14 -08:00
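Note on 679a841183: the idea of keeping a sorted list of pointers and recycling the pointed-to objects can be sketched as below. All names, the fixed capacity, and the 68-byte item are assumptions made for illustration; this is not the code in `query.c`. Inserting near the front moves only pointer-sized slots, and cleared items return to a free list instead of being freed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CAPACITY = 16 };

/* Hypothetical large item: 4-byte key plus 64 bytes of payload. */
typedef struct {
  uint32_t key;
  uint8_t payload[64];
} BigState;

/* Sorted set that stores pointers rather than the items themselves. */
typedef struct {
  BigState *items[CAPACITY];
  size_t size;
} StateSet;

/* Object pool: hands out storage slots and reuses released ones. */
typedef struct {
  BigState storage[CAPACITY];
  BigState *free_list[CAPACITY];
  size_t free_count;
  size_t used;
} StatePool;

static BigState *pool_acquire(StatePool *pool) {
  if (pool->free_count > 0) return pool->free_list[--pool->free_count];
  if (pool->used < CAPACITY) return &pool->storage[pool->used++];
  return NULL;
}

static void pool_release(StatePool *pool, BigState *state) {
  pool->free_list[pool->free_count++] = state;
}

/* Insert in key order; the memmove shifts pointers, not 68-byte items. */
static bool set_insert_sorted(StateSet *set, StatePool *pool, uint32_t key) {
  if (set->size == CAPACITY) return false;
  BigState *state = pool_acquire(pool);
  if (state == NULL) return false;
  state->key = key;
  size_t index = 0;
  while (index < set->size && set->items[index]->key < key) index++;
  memmove(&set->items[index + 1], &set->items[index],
          (set->size - index) * sizeof(BigState *));
  set->items[index] = state;
  set->size++;
  return true;
}

/* Empty the set, returning every item to the pool for reuse. */
static void set_clear(StateSet *set, StatePool *pool) {
  for (size_t i = 0; i < set->size; i++) pool_release(pool, set->items[i]);
  set->size = 0;
}

/* Usage example: keys stay sorted, and a cleared item is reused. */
static bool example_runs_as_expected(void) {
  StatePool pool = {0};
  StateSet set = {0};
  set_insert_sorted(&set, &pool, 30);
  set_insert_sorted(&set, &pool, 10);
  set_insert_sorted(&set, &pool, 20);
  bool sorted = set.items[0]->key == 10 && set.items[1]->key == 20 &&
                set.items[2]->key == 30;
  set_clear(&set, &pool);
  set_insert_sorted(&set, &pool, 5);
  bool reused = pool.used == 3; /* no new storage slot was consumed */
  return sorted && reused;
}
```

The trade-off is an extra pointer dereference on every comparison, which the commit message suggests is outweighed by the reduced copying for grammars with many states.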
Andrew Helwer
80c34d62ab Fixed rust build, updated docs 2022-01-07 10:36:25 -05:00