229 commits

Author SHA1 Message Date
Paul Gey
cf69a2c94c Use IndexMap and FxHash for some hot hash maps 2021-08-08 21:45:43 +02:00
Max Brunsfeld
c512a0eed7
Merge pull request #1194 from ahlinc/fix/1032
Close #1032 - fix all weirdness in the generated Cargo.toml
2021-06-29 16:48:23 -07:00
Andrew Hlynskyi
f22d62393b fix(cli): actual Rust binding version in generated Cargo.toml 2021-06-30 00:36:11 +03:00
Andrew Hlynskyi
d3527109a8 Updating of binding.gyp should depend on its content instead of bindings/node folder 2021-06-23 02:42:48 +03:00
Andrew Hlynskyi
22d63338a2 Use double quoted patterns for more precise pattern matching in the binding.gyp files 2021-06-23 02:41:30 +03:00
Andrew Hlynskyi
86b8137457 Add create_path_else fn to handle creation or modification 2021-06-23 02:40:32 +03:00
Andrew Hlynskyi
797c7668c1 feat(cli): Independent language binding files generation 2021-06-23 02:39:38 +03:00
Andrew Hlynskyi
4578e58794 fix(cli): close #1032 - fix repository template url generation in cargo.toml 2021-06-23 01:02:29 +03:00
Douglas Creager
d2d01e77e3 cli: Use anyhow and thiserror for errors
This patch updates the CLI to use anyhow and thiserror for error
management.  The main feature that our custom `Error` type was providing
was a _list_ of messages, which would allow us to annotate "lower-level"
errors with more contextual information.  This is exactly what's
provided by anyhow's `Context` trait.

(This is setup work for a future PR that will pull the `config` and
`loader` modules out into separate crates; by using `anyhow` we wouldn't
have to deal with a circular dependency with the new crates.)
2021-06-09 16:17:23 -04:00
Andrew Hlynskyi
3c0152a331 chore(fmt): Apply 'cargo fmt' to the whole code base 2021-05-19 23:21:43 +03:00
Markus F.X.J. Oberhumer
cc519b3121 cli: Improve const-correctness of the generated parsers (part 2 of 2).
This is a follow-up to my previous commit 1badd131f9.

I've made this an extra patch as it requires a minor
API change in <tree_sitter/parser.h>.

This commit moves the remaining generated tables into
the read-only segment.

Before:
  $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
       gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
    done
  $ size --totals *.o
      text    data     bss     dec     hex filename
   5353477   24472       0 5377949  520f9d (TOTALS)

After:
  $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
       gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
    done
  $ size --totals *.o
      text    data     bss     dec     hex filename
   5378147       0       0 5378147  521063 (TOTALS)
2021-05-19 12:49:57 +02:00
Andrew Hlynskyi
b856f7e1bd Remove unneeded dead_code annotations 2021-04-30 06:55:00 +03:00
Markus F.X.J. Oberhumer
1badd131f9 cli: Improve const-correctness of the generated parsers.
This moves most of the generated tables from the data segment into
the text segment (read-only memory) so that it can be shared between
different processes.

As a bonus side effect we can also remove all casts in the generated parsers.

Before:
  size --totals target/scratch/*.so
      text    data     bss     dec     hex filename
    853623 4684560    2160 5540343  5489f7 (TOTALS)

After:
  size --totals target/scratch/*.so
      text    data     bss     dec     hex filename
   5472086   68616     480 5541182  548d3e (TOTALS)
2021-04-27 09:22:18 +02:00
Andrew Hlynskyi
7aa538dd97 fix(cli): use dashed language name in generated package.json and Cargo.toml files 2021-04-22 16:29:48 +03:00
Andrew Hlynskyi
9416f975d3 fix(cli): set actual cli version in generated package.json 2021-04-22 16:29:48 +03:00
an-kumar
aabe6100d0
Update generated Cargo.toml's tree-sitter dependency
tree-sitter 0.19.0 bumped the language version from 12 to 13. `npm install tree-sitter-cli` gets a recent version of tree-sitter, which generates languages with language version 13. However, the Cargo.toml generated from `tree-sitter generate` still has an old tree-sitter as a dependency. This causes the Rust bindings to not work out of the box, as the tree-sitter library expects language version 12.

It would be nice to add a test for this in CI.  `tree-sitter generate` already creates a test for the rust binding, and that test fails out of the box due to the language mismatch.
2021-04-09 10:59:51 -07:00
Max Brunsfeld
c3eb5daa31 Include has_preceding_inherited_fields in Item's hash impl 2021-03-27 10:08:24 -07:00
Max Brunsfeld
57036b4f8a Extract lexer helper functions for all large char sets
No need to restrict it to char sets used in multiple places.
This is important because the helper functions are now implemented
more efficiently than the inline comparisons (using a binary search).
2021-03-11 11:48:48 -08:00
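
To illustrate the binary search mentioned in commit 57036b4f8a, here is a minimal Rust sketch of a membership test over a sorted list of character ranges. It is not the generated lexer code (which is C); the function name `set_contains` and the tuple representation of ranges are assumptions made for this example.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not part of tree-sitter): returns true if `c` lies in
/// one of `ranges`, which must be sorted, non-overlapping and inclusive.
fn set_contains(ranges: &[(u32, u32)], c: u32) -> bool {
    ranges
        .binary_search_by(|&(start, end)| {
            if end < c {
                // The whole range lies below `c`, so look further right.
                Ordering::Less
            } else if start > c {
                // The whole range lies above `c`, so look further left.
                Ordering::Greater
            } else {
                // start <= c <= end: `c` is inside this range.
                Ordering::Equal
            }
        })
        .is_ok()
}
```

With ranges for ASCII digits and letters, `set_contains(&[(0x30, 0x39), (0x41, 0x5A), (0x61, 0x7A)], 'q' as u32)` is true, while the same call with `'_'` is false. The lookup takes a logarithmic number of comparisons in the number of ranges, instead of one inline comparison per range.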
Andrew Hlynskyi
a331607f4e dsl.js: Reuse sym() in RuleBuilder 2021-03-10 23:06:53 +02:00
Max Brunsfeld
9e50befcf8 For node-types.json, process supertypes in a stable order 2021-03-08 12:02:01 -08:00
Max Brunsfeld
8e894ff3f1 Add --no-bindings flag to generate subcommand 2021-03-08 12:01:45 -08:00
Max Brunsfeld
7300249d20 Fix incorrect merging of states with different inherited fields
Co-authored-by: Douglas Creager <dcreager@dcreager.net>
2021-03-05 14:49:28 -08:00
Max Brunsfeld
e20aff9a9c Fix templates for rust binding files 2021-03-04 14:22:31 -08:00
Max Brunsfeld
e12093e8df Fix regression introduced in CharacterSet optimization 2021-03-04 13:50:27 -08:00
Max Brunsfeld
dd4cba2625 Allow symbols to be used in precedence lists 2021-03-03 13:11:05 -08:00
Max Brunsfeld
592fd8678d Organize TSLanguage fields
Due to the breaking ABI change in #943, this is our chance
to reorder the fields in a more logical way.
2021-03-01 10:27:22 -08:00
Max Brunsfeld
d56f9ebe4e Re-enable --prev-abi flag to generate command 2021-02-26 14:51:01 -08:00
Max Brunsfeld
075bf2bd5c In generate, create rust bindings
Also, migrate node binding files into the same 'bindings' folder.
2021-02-26 13:24:21 -08:00
Max Brunsfeld
c1639cc456 Add production_id_count field to Language objects
I think this is the last additional field that's needed so
that every array member of TSLanguage has a length that
can be calculated at runtime.
2021-02-25 16:32:05 -08:00
Max Brunsfeld
d8a235faa1 Add further static validation of named precedences 2021-02-25 11:54:21 -08:00
Max Brunsfeld
344797c110 Implement named precedence comparison 2021-02-24 16:02:56 -08:00
Max Brunsfeld
d40f118370 Generalize precedence datatype to include strings
Right now, the strings are not used in comparisons, but they
are passed through the grammar processing pipeline, and are
available to the parse table construction algorithm.

This also cleans up a confusing aspect of the parse table
construction, in which precedences and associativities were
temporarily stored in the parse table data structure itself.
2021-02-23 20:48:39 -08:00
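
As a rough illustration of commit d40f118370, a precedence type that admits strings alongside integers could look like the Rust sketch below. The variant names and the `compare_integers` helper are assumptions for this example, not necessarily the names used in the code base. The helper mirrors the state described in the message: named precedences are carried along but not yet compared.

```rust
use std::cmp::Ordering;

/// Hypothetical precedence value: absent, numeric, or named.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Precedence {
    None,
    Integer(i32),
    Name(String),
}

/// Compares two precedences only when both are integers.
/// Any pair involving `None` or `Name` yields no ordering in this sketch.
fn compare_integers(a: &Precedence, b: &Precedence) -> Option<Ordering> {
    match (a, b) {
        (Precedence::Integer(x), Precedence::Integer(y)) => Some(x.cmp(y)),
        _ => None,
    }
}
```

Returning `Option<Ordering>` keeps "not comparable" distinct from "equal", which leaves room for a later rule that orders named precedences.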
Max Brunsfeld
2f28a35e1b Handle unicode property escapes inside bracketed char classes
Refs #906
2021-02-18 22:27:44 -08:00
Max Brunsfeld
29bc26ecd5 Fix test failure after non-terminal extras change 2021-02-18 15:43:01 -08:00
Max Brunsfeld
86a891fa63 Fix bugs in parser generation for non-terminal extras
Previously, we attempted to completely separate the parse states
for item sets with non-terminal extras from the parse states
for other rules. But there was not a complete separation.

It actually isn't necessary to separate the parse states in this way.
The only special behavior for parse states with non-terminal extra rules
is what happens at the *end* of the rule: these parse states need to
perform an unconditional reduction.

Luckily, it's possible to distinguish these *non-terminal extra ending*
states from other states just based on their normal structure, with
no additional state.
2021-02-18 14:14:22 -08:00
Max Brunsfeld
b46d51f224 Add a unit test for all unicode character escape forms 2021-02-17 17:49:01 -08:00
Max Brunsfeld
5b630054c6 Handle negated unicode property escapes in regexes
Refs #380
2021-02-17 17:22:33 -08:00
Max Brunsfeld
6ae04051e7 Tweak whitespace in generated character set functions 2021-02-17 16:32:49 -08:00
Max Brunsfeld
9d9eb2234f
Merge pull request #906 from tree-sitter/unicode-property-escapes
Handle simple unicode property escapes in regexes
2021-02-17 16:14:42 -08:00
Max Brunsfeld
dad8546776 Generate more compact code for character set binary search 2021-02-17 13:52:23 -08:00
Max Brunsfeld
6132a10b1c Use binary search in generated character set functions 2021-02-17 13:08:56 -08:00
Max Brunsfeld
f5a4c14dbe Add some doc comments to CharacterSet 2021-02-16 21:37:52 -08:00
Max Brunsfeld
2b0de9dfec Fix small bugs in conflict reporting
* Negative precedence values were not displayed
* Rule names were repeated in resolution suggestions
2021-02-01 13:30:06 -08:00
Max Brunsfeld
e3ba701344 Start work on handling unicode property escapes in regexes 2021-01-29 16:37:45 -08:00
Max Brunsfeld
38444ea7f9
Merge pull request #904 from tree-sitter/character-set-ranges
Represent CharacterSet internally as a vector of ranges
2021-01-29 13:35:48 -08:00
Andrew Hlynskyi
2b9e5f6c4b Fix hiding problems in ./build/Debug/tree_sitter_*_binding
Errors may also occur when loading debug-built modules, and the current
implementation completely hides them, so errors like 'undefined symbol' can't be
easily identified because the traceback and error message are wrong.
2021-01-29 15:54:10 +02:00
Max Brunsfeld
ab78ab3f9b Represent CharacterSet internally as a vector of ranges 2021-01-28 16:10:39 -08:00
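
The idea behind commit ab78ab3f9b, storing a character set as a vector of ranges, can be sketched in Rust as follows. This is an illustrative sketch only: the type `RangeSet`, its `add` method, and the choice of half-open `u32` ranges are assumptions, not the actual `CharacterSet` implementation.

```rust
use std::ops::Range;

/// Hypothetical set of code points stored as sorted, non-overlapping,
/// non-adjacent half-open ranges.
#[derive(Debug, Default, PartialEq)]
struct RangeSet {
    ranges: Vec<Range<u32>>,
}

impl RangeSet {
    /// Adds one code point, merging with neighbouring ranges where possible.
    /// Assumes `c` is below `u32::MAX` (true for every Unicode code point).
    fn add(&mut self, c: u32) {
        // First range that ends at or after `c`; earlier ranges cannot touch it.
        let i = self.ranges.partition_point(|r| r.end < c);
        if i == self.ranges.len() {
            self.ranges.push(c..c + 1);
            return;
        }
        if self.ranges[i].contains(&c) {
            return; // already present
        }
        if self.ranges[i].end == c {
            // `c` extends range `i` on the right and may bridge to the next range.
            self.ranges[i].end = c + 1;
            if i + 1 < self.ranges.len() && self.ranges[i + 1].start == c + 1 {
                self.ranges[i].end = self.ranges[i + 1].end;
                self.ranges.remove(i + 1);
            }
        } else if self.ranges[i].start == c + 1 {
            // `c` extends range `i` on the left.
            self.ranges[i].start = c;
        } else {
            // `c` is isolated: insert a new single-element range.
            self.ranges.insert(i, c..c + 1);
        }
    }
}
```

Adding 'a' and 'c' produces two ranges, and then adding 'b' collapses them into the single range 97..100. Keeping the ranges sorted and merged is what makes a binary search over them possible.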
Max Brunsfeld
026231e93d Merge branch 'master' into HEAD 2020-12-03 09:44:33 -08:00
Max Brunsfeld
3497f34dd7 Fix parser-generation bugs introduced in #782 2020-11-02 13:43:28 -08:00
Arthur Baars
d62e7f7d75 Add test case with extra_symbols 2020-10-30 10:58:41 +01:00