This is a follow-up to my previous commit 1badd131f9.
I've made this a separate patch because it requires a minor
API change in <tree_sitter/parser.h>.
This commit moves the remaining generated tables into
the read-only segment.
Before:
$ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
done
$ size --totals *.o
text data bss dec hex filename
5353477 24472 0 5377949 520f9d (TOTALS)
After:
$ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
done
$ size --totals *.o
text data bss dec hex filename
5378147 0 0 5378147 521063 (TOTALS)
This bridges a gap between the two APIs: the C API reports whether the
match limit was exceeded on a query cursor, whereas the wasm API defines
the method on a query. Whenever you call a query method that might exceed
the match limit, we call the C API function and transfer the result across
the wasm boundary, storing it in the JavaScript wrapper class.
This moves most of the generated tables from the data segment into
the text segment (read-only memory) so that they can be shared between
different processes.
As a side effect we can also remove all casts in the generated parsers.
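To illustrate the idea, here is a minimal sketch, not the actual generated code. All names (DemoLanguage, demo_lex_modes, demo_symbol_names, demo_symbol_name) are hypothetical. It assumes the language struct declares its table members as pointers to const, so that const-qualified tables can be assigned without casts and the toolchain is free to place them in read-only memory. Tables that themselves hold pointers may still need relocation in position-independent code, which could be why some data remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a language struct; not the real TSLanguage layout. */
typedef struct {
  uint32_t symbol_count;
  const uint16_t *lex_modes;       /* pointer to const: the table may be read-only */
  const char *const *symbol_names; /* const pointers to const strings */
} DemoLanguage;

/* Const-qualified tables: eligible for a read-only section instead of .data. */
static const uint16_t demo_lex_modes[3] = {0, 1, 1};
static const char *const demo_symbol_names[3] = {"end", "identifier", "number"};

/* No casts needed: the member types already match the const tables. */
static const DemoLanguage demo_language = {
  .symbol_count = 3,
  .lex_modes = demo_lex_modes,
  .symbol_names = demo_symbol_names,
};

const DemoLanguage *demo_get_language(void) {
  return &demo_language;
}

/* Returns the name of a symbol, or NULL if the symbol is out of range. */
const char *demo_symbol_name(const DemoLanguage *lang, uint32_t symbol) {
  assert(lang != NULL);
  return symbol < lang->symbol_count ? lang->symbol_names[symbol] : NULL;
}
```

Compiling a file like this with and without the const qualifiers and comparing the output of size should show the tables moving between the data and text columns, though the exact numbers depend on the compiler and flags.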
Before:
size --totals target/scratch/*.so
text data bss dec hex filename
853623 4684560 2160 5540343 5489f7 (TOTALS)
After:
size --totals target/scratch/*.so
text data bss dec hex filename
5472086 68616 480 5541182 548d3e (TOTALS)
tree-sitter 0.19.0 bumped the language version from 12 to 13. `npm install tree-sitter-cli` gets a recent version of tree-sitter, which generates languages with language version 13. However, the Cargo.toml generated by `tree-sitter generate` still lists an old tree-sitter as a dependency. As a result, the Rust bindings do not work out of the box, because that older tree-sitter library expects language version 12.
It would be nice to add a test for this in CI. `tree-sitter generate` already creates a test for the Rust binding, and that test fails out of the box due to the language version mismatch.
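The mismatch can be pictured with a small sketch. It assumes the library accepts a language only when its version falls inside a supported range. DemoLibrary and demo_accepts_language are hypothetical names, and only the versions 12 and 13 come from the report above; the lower bounds used in the usage note are placeholders.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the language versions a library build supports. */
typedef struct {
  uint32_t min_version; /* oldest language version still accepted (assumed) */
  uint32_t max_version; /* newest language version the library knows about */
} DemoLibrary;

/* A language is usable only if its version lies within the library's range. */
bool demo_accepts_language(DemoLibrary lib, uint32_t language_version) {
  return language_version >= lib.min_version &&
         language_version <= lib.max_version;
}
```

With an older library described as `(DemoLibrary){12, 12}`, a language generated at version 13 is rejected, while a library whose `max_version` is 13 accepts it.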
We now have an easier way to get at the language-specific configuration
in Rust, since we publish each language grammar as a crate with useful
accessor functions and globals.