The `Emoji` property alias is already present, but the actual property
is not available since it lives in a new file. This adds that file to
the `generate-unicode-categories-json`.
The `emoji-data` file follows the same format as the ones we already
consume in `generate-unicode-categories-json`, so adding emoji support
is fairly easy. his, grammars would need to hard-code a set of
unicode ranges in their own regex. The Javascript library `emoji-regex`
cannot be used because of #451.
For unclear reasons, the characters #, *, and 0-9 are marked as
`Emoji=Yes` by `emoji-data.txt`. Because of this, a grammar that wishes
to use emojis is likely to want to exclude those characters. For that
reason, this change also adds support for binary operations in regexes,
e.g. `[\p{Emoji}&&[^#*0-9]]`.
Lastly (and perhaps controversially), this change introduces new
variables available at grammar compile time, for the major, minor, and
patch versions of the tree-sitter CLI used to compile the grammar. This
will allow grammars to conditionally adopt these new regex features
while remaining backward compatible with older versions of the CLI.
Without this part of the change, grammar authors who do not precompile
and check-in their `grammar.json` would need to wait for downstream
systems to adopt a newer tree-sitter CLI version before they could begin
to use these features.
For some ABI changes, we may need to make changes to the parser.h in order
to restore a previous binary format, but for the current range of supported
ABI versions (13 + 14), the current parser.h is fine.
Refs #1599
Fixes
```
warning: In file included from src\lib.c:12:
warning: src/./parser.c:1781:28: warning: 'fdopen' is deprecated: The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _fdopen. See online help for details. [-Wdeprecated-declarations]
warning: self->dot_graph_file = fdopen(fd, "a");
warning: ^
warning: C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\stdio.h:2431:28: note: 'fdopen' has been explicitly marked deprecated here
warning: _Check_return_ _CRT_NONSTDC_DEPRECATE(_fdopen) _ACRTIMP FILE* __cdecl fdopen(_In_ int _FileHandle, _In_z_ char const* _Format);
warning: ^
warning: C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt.h:414:50: note: expanded from macro '_CRT_NONSTDC_DEPRECATE'
warning: #define _CRT_NONSTDC_DEPRECATE(_NewName) _CRT_DEPRECATE_TEXT( \
warning: ^
warning: C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\VC\Tools\MSVC\14.29.30133\include\vcruntime.h:310:47: note: expanded from macro '_CRT_DEPRECATE_TEXT'
warning: #define _CRT_DEPRECATE_TEXT(_Text) __declspec(deprecated(_Text))
```
Due to an oversight in #1589, I added `primary_field_ids` into the
`TSLanguage` struct in a place that wasn't the end. This is not actually
backwards compatible and causes downstream failures :(