You can now specify `$` as the position to apply an edit, signifying the
end of the file. (That prevents you from having to calculate the size
of the file yourself.)
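
As a rough illustration of the `$` shorthand, the position argument could be resolved along these lines. This is a minimal sketch, not the CLI's actual parsing code: the function name `resolve_position` is hypothetical, and it assumes the position is otherwise given as a plain decimal byte offset.

```rust
/// Resolve a position argument to a byte offset (hypothetical helper).
///
/// `spec` is either a decimal byte offset or `$`, which stands for the
/// end of the file. Offsets past the end of the file are rejected.
fn resolve_position(spec: &str, file_len: usize) -> Option<usize> {
    if spec == "$" {
        // `$` means "end of file", so the caller never needs the size.
        return Some(file_len);
    }
    spec.parse::<usize>().ok().filter(|&pos| pos <= file_len)
}
```

With a 120-byte file, `resolve_position("$", 120)` yields `Some(120)`, while an unparsable or out-of-range offset yields `None`.
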
This patch adds the `tree-sitter-config` crate, which manages
tree-sitter's configuration file. This new setup allows different
components to define their own serializable configuration types, instead
of having to create a single monolithic configuration type. But the
configuration itself is still stored in a single JSON file.
Before, the default location for the configuration file was
`~/.tree-sitter/config.json`. This patch updates the default location
to follow the XDG Base Directory spec (or other relevant platform-
specific spec). So on Linux, for instance, the new default location is
`~/.config/tree-sitter/config.json`. We will look in the new location
_first_, and fall back on reading from the legacy location if we can't
find anything.
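
The lookup order described above can be sketched as follows. The function name `find_config_file` and the injected `exists` predicate are assumptions made for illustration (the predicate keeps the sketch testable without touching the filesystem); the crate's real API may look different.

```rust
use std::path::{Path, PathBuf};

/// Return the first configuration file that exists, preferring the new
/// (XDG-style) location over the legacy one. Hypothetical helper.
fn find_config_file(
    new_location: &Path,
    legacy_location: &Path,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // Order matters: the new location is always checked first.
    for candidate in [new_location, legacy_location] {
        if exists(candidate) {
            return Some(candidate.to_path_buf());
        }
    }
    None
}
```

In real use the predicate would simply be `|p| p.exists()`.
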
This patch adds a new `tree-sitter-loader` crate, which holds the CLI's
logic for finding and building local grammar definitions at runtime.
This allows other command-line tools to use this logic too!
With the change to anyhow in the previous commit, we stopped ignoring
BrokenPipe errors. Now we do again, not as a core part of our error
type, but as part of the `main` function's reaction to any error that
occurs.
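
A minimal sketch of that kind of check, using only the standard library rather than anyhow. The helper names `is_broken_pipe` and `exit_code_for` are hypothetical, and the choice of exit code 0 for a broken pipe is an assumption for illustration.

```rust
use std::error::Error;
use std::io;

/// Returns true if `err`, or any error in its `source()` chain, is an
/// `io::Error` of kind `BrokenPipe`.
fn is_broken_pipe(err: &(dyn Error + 'static)) -> bool {
    let mut current = Some(err);
    while let Some(e) = current {
        if let Some(io_err) = e.downcast_ref::<io::Error>() {
            if io_err.kind() == io::ErrorKind::BrokenPipe {
                return true;
            }
        }
        current = e.source();
    }
    false
}

/// Exit code that `main` would use after a failed run: a broken pipe
/// (for example, output piped into `head`) is not treated as a failure.
fn exit_code_for(err: &(dyn Error + 'static)) -> i32 {
    if is_broken_pipe(err) { 0 } else { 1 }
}
```

Keeping the check in `main` means the error type itself needs no special case for this condition.
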
This patch updates the CLI to use anyhow and thiserror for error
management. The main feature that our custom `Error` type was providing
was a _list_ of messages, which would allow us to annotate "lower-level"
errors with more contextual information. This is exactly what's
provided by anyhow's `Context` trait.
(This is setup work for a future PR that will pull the `config` and
`loader` modules out into separate crates; by using `anyhow` we wouldn't
have to deal with a circular dependency between the new crates.)
The default is now a whopping 64K matches, which "should be enough for
everyone". You can use the new `ts_query_cursor_set_match_limit`
function to set this to a lower limit, such as the previous default of
32.
This function (and the similar `ts_tree_cursor_goto_first_child_for_byte`)
allows you to efficiently seek the tree cursor to a given position,
exploiting the tree's internal balancing, without having to visit
all of the preceding siblings of each node.
This restores the original signatures of the `set_byte_range` and
`set_point_range` functions. Now, the QueryCursor will properly report
matches that intersect, but are not fully contained by its range.
Co-Authored-By: Nathan Sobo <nathan@zed.dev>
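
The difference between "intersects" and "fully contained" can be sketched with half-open byte ranges. The helper names are hypothetical, and how the QueryCursor treats ranges that merely touch at a boundary is an assumption here, not something taken from the library.

```rust
use std::ops::Range;

/// True if `a` lies entirely inside `b`.
fn contained_in(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start >= b.start && a.end <= b.end
}

/// True if the two half-open ranges share at least one byte.
fn intersects(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}
```

For a cursor range of `10..20`, a match spanning `15..25` intersects the range without being contained in it, so it is the kind of match that is now reported.
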
Well, not completely unlimited — we're still using a 16-bit counter to
keep track of them. But we no longer have a static maximum of 32 pending
matches when executing a query.
This is a follow-up to my previous commit 1badd131f9.
I've made this an extra patch as it requires a minor
API change in <tree_sitter/parser.h>.
This commit moves the remaining generated tables into
the read-only segment.
Before:
$ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
done
$ size --totals *.o
text data bss dec hex filename
5353477 24472 0 5377949 520f9d (TOTALS)
After:
$ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
done
$ size --totals *.o
text data bss dec hex filename
5378147 0 0 5378147 521063 (TOTALS)
This moves most of the generated tables from the data segment into
the text segment (read-only memory) so that it can be shared between
different processes.
As a bonus side effect we can also remove all casts in the generated parsers.
Before:
size --totals target/scratch/*.so
text data bss dec hex filename
853623 4684560 2160 5540343 5489f7 (TOTALS)
After:
size --totals target/scratch/*.so
text data bss dec hex filename
5472086 68616 480 5541182 548d3e (TOTALS)