A `*usize` -> `*u32` pointer conversion on a 64-bit big-endian machine
points at the high 32 bits of the value. As a consequence, any result
returned via `count` is unexpectedly shifted left by 32 bits:
u32 = 00 00 00 01 // 1
usize = 00 00 00 01 00 00 00 00 // 4294967296
Fixes the following test failure:
```
$ cargo test -- tests::corpus_test
<...>
running 13 tests
memory allocation of 206158430208 bytes failed
error: test failed, to rerun pass '--lib'
```
DISCUSSION:
When compiling with `-Os` for "smallest, fastest", an error is reported in `parser.c`:
```
/Users/siegel/git/tree-sitter/lib/src/./parser.c:1368:10: error: unused variable 'did_merge' [-Werror,-Wunused-variable]
bool did_merge = ts_stack_merge(self->stack, version, previous_version_count);
^
1 error generated.
```
This is because with `NDEBUG` set, `assert(e)` collapses to `(void)0`,
which in turn means that `did_merge` does not actually get consumed.
This seems to get caught when compiling with `-Os`, but not otherwise.
Compiler version:
```
Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: arm64-apple-darwin21.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
```
* Allow iterations to be specified via an env var
* Randomly decide the edit count, with a maximum
specified via an env var.
* Instead of separate env vars for starting seed + trial, just accept a seed
* Remove some noisy output
The `Emoji` property alias is already present, but the actual property
is not available since it lives in a new file. This adds that file to
`generate-unicode-categories-json`.
The `emoji-data` file follows the same format as the ones we already
consume in `generate-unicode-categories-json`, so adding emoji support
is fairly easy. Without this, grammars would need to hard-code a set of
Unicode ranges in their own regex. The JavaScript library `emoji-regex`
cannot be used because of #451.
For unclear reasons, the characters #, *, and 0-9 are marked as
`Emoji=Yes` by `emoji-data.txt`. Because of this, a grammar that wishes
to use emojis is likely to want to exclude those characters. For that
reason, this change also adds support for binary operations in regexes,
e.g. `[\p{Emoji}&&[^#*0-9]]`.
Lastly (and perhaps controversially), this change introduces new
variables available at grammar compile time, for the major, minor, and
patch versions of the tree-sitter CLI used to compile the grammar. This
will allow grammars to conditionally adopt these new regex features
while remaining backward compatible with older versions of the CLI.
Without this part of the change, grammar authors who do not precompile
and check in their `grammar.json` would need to wait for downstream
systems to adopt a newer tree-sitter CLI version before they could begin
to use these features.