111 commits

Author SHA1 Message Date
Max Brunsfeld
63fa0f23f2 Include 2-character ranges in array-based state transitions 2024-04-12 16:40:04 -07:00
Max Brunsfeld
7ec40b0ab4 Implement single-char state transitions using a static array and for loop
This reduces compile time, compared to generating many individual if statements.
2024-04-12 14:40:11 -07:00
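The idea in 7ec40b0ab4 can be illustrated roughly as follows: a chain of `if (lookahead == ...)` checks becomes one static table of (character, next state) pairs scanned by a single loop. This is a hand-written sketch, not the generator's actual output. `Transition`, `STATE_0_TRANSITIONS`, `next_state`, `NO_TRANSITION`, and the state numbers are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a lookahead character and the state it leads to. */
typedef struct {
  int32_t character;
  uint16_t next_state;
} Transition;

/* Sentinel returned when no entry matches (an assumption of this sketch). */
#define NO_TRANSITION UINT16_MAX

/* One static table stands in for a chain of individual `if` statements. */
static const Transition STATE_0_TRANSITIONS[] = {
  {'(', 1},
  {')', 2},
  {'+', 3},
  {'-', 4},
};

/* Scan the table linearly and return the matching next state, if any. */
static uint16_t next_state(const Transition *table, size_t count, int32_t lookahead) {
  for (size_t i = 0; i < count; i++) {
    if (table[i].character == lookahead) return table[i].next_state;
  }
  return NO_TRANSITION;
}
```

The compiler then sees one small loop plus constant data rather than many separate branches, which is the compile-time saving the commit message describes.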
Max Brunsfeld
3210c7e21f Avoid using a large character set constant when it doesn't reduce code size 2024-04-12 12:01:23 -07:00
Max Brunsfeld
3498498449 Merge branch 'master' into simpler-large-char-set-code 2024-04-12 10:03:46 -07:00
Max Brunsfeld
15fe07a20e Clean up code generation for lexer state transitions 2024-04-12 09:02:33 -07:00
Amaan Qureshi
abc7910381 refactor(rust): misc fixes & tidying 2024-04-11 22:35:43 -04:00
Amaan Qureshi
5825e24d56 style: wrap comments 2024-04-11 22:35:43 -04:00
Amaan Qureshi
b35efa8f33 style: format imports 2024-04-11 22:35:43 -04:00
Max Brunsfeld
1f0707e1ac Fix clippy warnings 2024-04-11 16:29:59 -07:00
Max Brunsfeld
b8701fcf18 Check EOF when checking a large char set that contains the null character 2024-04-11 16:19:21 -07:00
Max Brunsfeld
be6e6d3708 Merge branch 'master' into simpler-large-char-set-code 2024-04-11 16:03:20 -07:00
Max Brunsfeld
18ea74ee12 Merge pull request #3280 from ObserverOfTime/reduce
refactor(parser): make REDUCE macro non-variadic
2024-04-11 15:39:28 -07:00
ObserverOfTime
63babea301 fix: proper function prototypes 2024-04-11 16:28:21 -04:00
ObserverOfTime
818cd8c291 refactor(parser): make REDUCE macro non-variadic 2024-04-11 20:47:08 +03:00
Sebastian Lackner
5dc62cc828 fix(cli): fix mismatched parenthesis when accounting for && 2024-04-11 09:01:56 -04:00
Max Brunsfeld
3d088888f5 Derive large character sets from lex states for individual tokens 2024-04-10 16:53:39 -07:00
Max Brunsfeld
be34bc9430 Identify large char sets for lexer using NFA transitions 2024-04-09 17:53:37 -07:00
Amaan Qureshi
a9172e0caa fix: add a semicolon after SKIP macros 2024-04-08 17:56:05 -04:00
Amaan Qureshi
abed43a169 chore: clippy fix 2024-04-08 17:56:05 -04:00
ObserverOfTime
78b6067a5d fix(parser): fix variadic macro 2024-04-02 03:18:05 -04:00
Max Brunsfeld
39be6972fe Use static arrays and a fixed binary search for large char set checks 2024-03-29 23:00:48 -07:00
ObserverOfTime
4bbaee2f56 fix(lib): allow hiding symbols 2024-03-17 07:21:06 -04:00
Amaan Qureshi
92675117a6 fix(generate): extern allocator functions for the template don't need to be "exported" 2024-03-05 11:19:06 -05:00
Amaan Qureshi
54a31069af fix: parsers should export the language function on windows 2024-03-05 11:19:06 -05:00
Amaan Qureshi
304f8b7c04 fix: don't use __declspec(dllexport) on windows 2024-03-04 13:23:06 -05:00
ObserverOfTime
b4b2d9cecc refactor: remove extern/const where possible 2024-02-29 01:50:04 -05:00
Amaan Qureshi
32c23b6c90 fix: wrap || comparison in parenthesis when && is used 2024-02-24 01:35:40 -05:00
Amaan Qureshi
b40839cd72 style: prefer turbofish syntax where possible 2024-02-19 16:00:50 -05:00
Amaan Qureshi
5ea0dbf77a chore: some more clippy lints 2024-02-13 03:33:07 -05:00
Amaan Qureshi
59be1edaa1 refactor: swap &Vec[T] with &[T] where appropriate 2024-02-07 02:50:31 -05:00
Amaan Qureshi
04ff704bca chore(cli): apply clippy fixes 2024-02-04 04:18:48 -05:00
Andrew Hlynskyi
d56b51a11d Revert "Alt #2454" 2023-11-29 11:20:05 +02:00
Andrew Hlynskyi
60779cc1ac fix(gen): parser.c should include parser.h relatively 2023-08-26 20:57:08 +03:00
Andrew Hlynskyi
b3fef28a10 chore(gen): add parser.c enum names to be better discoverable 2023-08-25 19:11:42 +03:00
Andrew Hlynskyi
fbfa58edc8 chore(gen): move external scanner stuff closer to the end of parser.c 2023-08-25 19:11:42 +03:00
Andrew Hlynskyi
683fe442e4 fix(gen): cycle between aliases and anonymous symbols
An example of an error cycle in a `parser.c`:

```
static const TSSymbol ts_symbol_map[] = {
  ...
  [anon_sym_RBRACE] = anon_sym_RBRACE2,
  [anon_sym_RBRACE2] = anon_sym_RBRACE,
  ...
};
```
2023-08-23 16:51:05 +03:00
Amaan Qureshi
e0434327d0 fix(render): only output SPACE for strings that are just a space 2023-08-16 13:44:44 -04:00
Amaan Qureshi
0b1b0d2fb7 fix: replace & sanitize more characters 2023-08-13 19:29:37 -04:00
Andrew Hlynskyi
4a007259fc Fix warning from #2454 in more clear way 2023-08-10 03:59:34 +03:00
Amaan Qureshi
b8fe5fe21b fix: do not allow eof to advance states if the new state is the same state 2023-08-02 10:47:27 +01:00
Max Brunsfeld
4b93326898 Don't generate primary states array if it will be unused due to abi version setting 2022-03-02 14:57:59 -08:00
Alex Pinkus
858ea5782b Fix back compat by moving primary_field_ids to the end
Due to an oversight in #1589, I added `primary_field_ids` into the
`TSLanguage` struct in a place that wasn't the end. This is not actually
backwards compatible and causes downstream failures :(
2022-01-17 17:23:02 -08:00
Max Brunsfeld
516fd6f6de Add --abi flag to generate command, generate version 13 by default 2022-01-17 14:50:47 -08:00
Alex Pinkus
eaf9b170f1 Don't start with duplicate states in ts_query__analyze_patterns
This change exposes a new `primary_state_ids` field on the `TSLanguage`
struct, and populates it by tracking the first encountered state with a
given `core_id`. (For posterity: the initial change just exposed
`core_id` and deduplicated within `ts_analyze_query`).

With this `primary_state_ids` field in place, the
`ts_query__analyze_patterns` function only needs to populate its
subgraphs with starting states that are _primary_, since non-primary
states behave identically to primary ones. This leads to large savings
across the board, since most states are not primary.
2022-01-16 11:17:47 -08:00
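The "first encountered state with a given `core_id`" rule from eaf9b170f1 can be sketched like this. It is a minimal illustration under stated assumptions, not the generator's code: `compute_primary_state_ids`, `is_primary`, the `first_seen` scratch buffer, and the flat `core_ids` array are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Map every state to the first state that shares its core_id.
 * Assumes each core_ids[i] < core_count, and that first_seen is a
 * caller-provided scratch buffer of length core_count. */
static void compute_primary_state_ids(const uint16_t *core_ids, size_t state_count,
                                      uint16_t *first_seen, size_t core_count,
                                      uint16_t *primary_state_ids) {
  /* UINT16_MAX marks a core_id that has not been encountered yet. */
  for (size_t c = 0; c < core_count; c++) first_seen[c] = UINT16_MAX;

  for (size_t s = 0; s < state_count; s++) {
    uint16_t core = core_ids[s];
    if (first_seen[core] == UINT16_MAX) first_seen[core] = (uint16_t)s;
    primary_state_ids[s] = first_seen[core];
  }
}

/* A state is primary when it maps to itself. */
static int is_primary(const uint16_t *primary_state_ids, uint16_t state) {
  return primary_state_ids[state] == state;
}
```

A consumer such as the query analysis described above could then skip any starting state for which `is_primary` returns 0.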
Paul Gey
965e3c9e5e Generator::add_parse_table: Store entries in hash map
This avoids a quadratic behaviour due to repeatedly using `find` on a
growing `Vec`.
2021-08-08 21:45:43 +02:00
Andrew Hlynskyi
3c0152a331 chore(fmt): Apply 'cargo fmt' to the whole code base 2021-05-19 23:21:43 +03:00
Markus F.X.J. Oberhumer
cc519b3121 cli: Improve const-correctness of the generated parsers (part 2 of 2).
This is a follow-up to my previous commit 1badd131f9.

I've made this an extra patch as it requires a minor
API change in <tree_sitter/parser.h>.

This commit moves the remaining generated tables into
the read-only segment.

Before:
  $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
       gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
    done
  $ size --totals *.o
      text    data     bss     dec     hex filename
   5353477   24472       0 5377949  520f9d (TOTALS)

After:
  $ for f in bash c cpp go html java javascript jsdoc json php python ruby rust; do \
       gcc -o $f.o -O2 -Ilib/include -c test/fixtures/grammars/$f/src/parser.c; \
    done
  $ size --totals *.o
      text    data     bss     dec     hex filename
   5378147       0       0 5378147  521063 (TOTALS)
2021-05-19 12:49:57 +02:00
Markus F.X.J. Oberhumer
1badd131f9 cli: Improve const-correctness of the generated parsers.
This moves most of the generated tables from the data segment into
the text segment (read-only memory) so that it can be shared between
different processes.

As a bonus side effect we can also remove all casts in the generated parsers.

Before:
  size --totals target/scratch/*.so
      text    data     bss     dec     hex filename
    853623 4684560    2160 5540343  5489f7 (TOTALS)

After:
  size --totals target/scratch/*.so
      text    data     bss     dec     hex filename
   5472086   68616     480 5541182  548d3e (TOTALS)
2021-04-27 09:22:18 +02:00
Max Brunsfeld
57036b4f8a Extract lexer helper functions for all large char sets
No need to restrict it to char sets used in multiple places.
This is important because the helper functions are now implemented
more efficiently than the inline comparisons (using a binary search).
2021-03-11 11:48:48 -08:00
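For illustration, a binary-search membership check of the kind 57036b4f8a refers to might look like the sketch below. It uses a plain textbook binary search over sorted, non-overlapping inclusive ranges, which is not necessarily the form the generator emits. `CharRange`, `IDENT_START`, and `char_set_contains` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive code point range. */
typedef struct {
  int32_t start;
  int32_t end;
} CharRange;

/* Example set; ranges must be sorted by start and must not overlap. */
static const CharRange IDENT_START[] = {
  {'A', 'Z'}, {'_', '_'}, {'a', 'z'}, {0xC0, 0xD6}, {0xD8, 0xF6},
};

/* Return true if c falls inside one of the ranges. */
static bool char_set_contains(const CharRange *ranges, size_t count, int32_t c) {
  size_t lo = 0, hi = count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < ranges[mid].start) {
      hi = mid;          /* c lies before this range */
    } else if (c > ranges[mid].end) {
      lo = mid + 1;      /* c lies after this range */
    } else {
      return true;       /* start <= c <= end */
    }
  }
  return false;
}
```

With N ranges this needs about log2(N) comparisons per character instead of up to 2N inline comparisons, which is why a helper can beat the inline form even when the set is used in only one place.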
Max Brunsfeld
592fd8678d Organize TSLanguage fields
Due to the breaking ABI change in #943, this is our chance
to reorder the fields in a more logical way.
2021-03-01 10:27:22 -08:00