Commit graph

80 commits

Author SHA1 Message Date
bglgwyng
c19cce111f refactor: use metadata_for_symbol helper in node_types generation 2025-11-20 19:25:29 +09:00
bglgwyng
c5b70d3c5c fix: use is_named variable instead of hardcoded true in symbol_ids lookup 2025-11-20 17:43:45 +09:00
bglgwyng
6dcc1edff2 test: add parser and node-types.json compatibility tests for multiple grammars 2025-11-20 12:37:37 +09:00
bglgwyng
80b5bce27a fix: fix return value of get_node_types 2025-11-20 11:58:42 +09:00
bglgwyng
f1f11bde00 Merge branch 'master' into include-symbol_id-in-node-types-json 2025-11-19 21:53:39 +09:00
bglgwyng
48b2440b1e refactor: remove unused imports from generate module 2025-11-19 21:51:22 +09:00
bglgwyng
98acc93411 fix: pass unique_aliases to assign symbol_ids for aliases 2025-11-19 21:48:08 +09:00
bglgwyng
f4472c0140 refactor: change unique_aliases to store tuples with numeric symbol IDs 2025-11-19 13:19:47 +09:00
bglgwyng
9f3677dc10 refactor: remove unused alias ID generation code 2025-11-19 13:03:00 +09:00
bglgwyng
a496b8af43 refactor: change symbol_ids to store multiple IDs per node type 2025-11-18 22:57:07 +09:00
WillLillis
61c21aa408 refactor(generate)!: include path when available in IO errors 2025-11-14 11:28:00 +01:00
WillLillis
db2d221ae9 fix(generate): remove leftover imports of anyhow 2025-11-14 11:28:00 +01:00
bglgwyng
0ad40ec263 refactor: extract symbol ID generation into dedicated module 2025-11-09 18:58:27 +09:00
bglgwyng
e029188319 test: update node type tests to use actual symbol IDs 2025-11-09 17:29:30 +09:00
bglgwyng
ff4c91a614 fix: fix grammar conflicts in test cases for parsing table generation 2025-11-09 17:14:00 +09:00
bglgwyng
ab9b098aad refactor: extract grammar introspection into separate module 2025-11-09 16:19:49 +09:00
bglgwyng
04420e4b51 refactor: remove unused JSONOutput and GeneratedParser structs 2025-11-07 17:29:25 +09:00
bglgwyng
21c9f9ae4f refactor: change symbol_ids to store both string and numeric IDs
- Modified symbol_ids HashMap to store tuples of (String, u16) instead of just String
- Updated symbol ID generation to assign numeric IDs sequentially (0 for end symbol, then 1, 2, 3...)
- Changed all symbol_ids access patterns throughout codebase to use tuple destructuring (.0 for string, .1 for numeric)
- Updated node_types.json to use numeric u16 symbol_id instead of String
2025-11-07 17:29:05 +09:00
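The numbering scheme described in the entry above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `assign_symbol_ids`, the `"end"` map key, and the `sym_` prefix for string IDs are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: maps each symbol name to a (string ID, numeric ID) pair.
/// The end symbol takes numeric ID 0; the remaining symbols are numbered 1, 2, 3...
fn assign_symbol_ids(names: &[&str]) -> HashMap<String, (String, u16)> {
    let mut ids = HashMap::new();
    // Assumed key and string ID for the end symbol.
    ids.insert("end".to_string(), ("ts_builtin_sym_end".to_string(), 0));
    for (i, name) in names.iter().enumerate() {
        // Numeric IDs are assigned sequentially after the end symbol.
        let numeric = u16::try_from(i + 1).expect("too many symbols for u16");
        ids.insert(name.to_string(), (format!("sym_{name}"), numeric));
    }
    ids
}
```

Callers then read `.0` for the string ID and `.1` for the numeric ID, for example `assign_symbol_ids(&["identifier"])["identifier"].1`.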
bglgwyng
8238c36f5f style: format 2025-11-07 16:47:08 +09:00
bglgwyng
4519e2b8cc feat: add symbol_id field to node type JSON output
- Added symbol_id as optional field in NodeTypeJSON struct for tracking grammar symbols
- Threaded symbol_ids HashMap through generate_node_types_json function to populate symbol IDs
- Updated all test assertions to include symbol_id: None for backward compatibility
2025-11-07 16:46:52 +09:00
bglgwyng
b7d85668fe refactor: extract grammar introspection into separate function
- Consolidated grammar processing logic into new `introspect_grammar` function
- Removed intermediate `GeneratedParser` and `JSONOutput` structs in favor of direct `GrammarIntrospection` struct
- Simplified code generation flow by separating grammar analysis from code rendering
2025-11-07 16:17:28 +09:00
bglgwyng
3b8a653167 refactor: extract symbol ID generation and helper functions
- Moved symbol ID generation logic out of renderer initialization into standalone function
- Extracted sanitize_identifier and metadata_for_symbol as reusable helper functions
- Symbol IDs now computed before rendering and passed to renderer constructor
2025-11-07 11:57:39 +09:00
Will Lillis
419a5a7305 fix(generate): don't short-circuit within extend_sorted 2025-11-03 01:22:29 -05:00
WillLillis
b8f52210f9 perf: reduce needless allocations 2025-10-30 18:24:42 +01:00
Christian Clason
6188010f53 build(deps): bump rquickjs to v0.10.0 2025-10-29 18:30:25 -04:00
Will Lillis
87d778a1c6 fix(rust): apply Self usage in struct definition lint 2025-10-24 17:50:28 -04:00
Will Lillis
e344837e35 fix(rust): minor cleanup in generate code 2025-10-24 17:50:28 -04:00
Will Lillis
b3bc7701cd refactor(generate): make AliasMap use BTreeMap over HashMap 2025-10-12 15:56:30 -04:00
Will Lillis
262f1782cc fix(generate): ensure deterministic iteration order for symbol aliases while constructing node-types.json
2025-10-12 15:56:30 -04:00
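The two entries above both concern iteration order. As a rough sketch of why a `BTreeMap` helps: it iterates in sorted key order regardless of insertion order, whereas `HashMap` iteration order is unspecified. The `AliasTable` type and `alias_names_in_order` function below are hypothetical stand-ins, not the crate's actual `AliasMap`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an alias table: alias name -> whether it is named.
type AliasTable = BTreeMap<String, bool>;

/// Returns alias names in the map's iteration order, which for a BTreeMap
/// is always ascending key order, so generated output stays stable across runs.
fn alias_names_in_order(aliases: &AliasTable) -> Vec<&str> {
    aliases.keys().map(String::as_str).collect()
}
```

Inserting `"b"`, `"a"`, `"c"` in that order and calling `alias_names_in_order` yields the names sorted as `a`, `b`, `c`.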
WillLillis
00d172bf9f fix(generate): correct display of precedence for `--report-states-for-rule`
2025-10-12 15:56:12 -04:00
Will Lillis
ae54350c76 fix(generate): Add missing fields to NodeInfoJson sorting
This ensures a deterministic ordering for node-types.json
2025-10-11 14:25:52 -04:00
Amaan Qureshi
5f7806f99e feat: add option to disable parse state optimizations 2025-09-26 02:40:53 -04:00
WillLillis
a9bce7c18a fix(generate): return error when generated grammar's state count exceeds the maximum allowed value.

Co-authored-by: Amaan Qureshi <git@amaanq.com>
2025-09-25 22:29:04 -05:00
Amaan Qureshi
ce56465197 test(rust): prefer asserts to panics 2025-09-23 01:19:14 -04:00
Amaan Qureshi
b0cdab85fe refactor(rust): avoid panics where possible 2025-09-23 01:19:14 -04:00
Amaan Qureshi
c89e40f008 fix(generate): fix builds outside of crate workspace 2025-09-21 02:34:10 -04:00
ObserverOfTime
d13657c40c refactor(generate): use the logger
Co-authored-by: Amaan Qureshi <git@amaanq.com>
2025-09-21 01:53:22 -04:00
Amaan Qureshi
311585d304 refactor!: rename stage flag to emit 2025-09-20 22:35:23 -04:00
Will Lillis
46ea65c89b refactor: remove url dependency 2025-09-17 04:31:53 -04:00
Will Lillis
6a28a62369 test: add safety checks to ensure language version constants are kept in sync

The generate crate defines the `LANGUAGE_VERSION` constant separately
from the TREE_SITTER_LANGUAGE_VERSION definition in `api.h`.
2025-09-17 02:58:31 -04:00
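One way such a sync check could look is sketched below. It scans header text for a `#define` and compares the value with a Rust constant. The `parse_define` helper and the value `15` are assumptions for illustration; the actual test in the commit may work differently.

```rust
/// Hypothetical value for illustration; the real constant lives in the generate crate.
const LANGUAGE_VERSION: usize = 15;

/// Finds `#define <name> <value>` in C header text and parses the value.
/// Returns None if the define is missing or its value is not an integer.
fn parse_define(header: &str, name: &str) -> Option<usize> {
    header.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        if parts.next()? != "#define" {
            return None;
        }
        if parts.next()? != name {
            return None;
        }
        parts.next()?.parse().ok()
    })
}
```

A test could then read `api.h` and assert that `parse_define(&text, "TREE_SITTER_LANGUAGE_VERSION")` equals `Some(LANGUAGE_VERSION)`.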
Amaan Qureshi
317e2e74c2 Revert "feat(generate): allow more characters for keywords"
This reverts commit 0269357c5a.
2025-09-17 02:19:29 -04:00
Amaan Qureshi
04cfee5664 build(rust): remove unused dependencies 2025-09-16 18:57:06 -04:00
Amaan Qureshi
57c6105897 fix(generate): remove warning message for CJS grammars 2025-09-16 16:42:17 -04:00
Christian Clason
339bad2de4 feat(generate): don't embed tree-sitter CLI version in parser
Problem: embedding the CLI version used to generate a parser triggers CI
failures on all grammars for every (patch) release of tree-sitter, even
if there are no actual parser changes.

Solution: do not embed the version; instead rely on whether the update
introduces actual (presumably desirable) changes in the parser to
indicate regeneration is necessary.
2025-09-16 19:21:34 +02:00
Will Lillis
31ff62445b fix(generate): assert there is a Nfa last state before retrieving it
Prevents unsigned subtraction wrapping antics in release builds
2025-09-16 03:51:13 -04:00
bbb651
9593737871 build(generate): remove tree-sitter dependency
It was only used to share two constants, and balloons its dependencies.

This also makes `generate_parser_for_grammar` work in wasm.
(Tested in `wasm32-wasip2` in wasmtime with the json grammar,
`wasm32-unknown-unknown` running in the same setup exited successfully
so I'm pretty confident it works as well)

Co-authored-by: Amaan Qureshi <contact@amaanq.com>
2025-09-16 03:48:30 -04:00
Amaan Qureshi
0269357c5a feat(generate): allow more characters for keywords 2025-09-16 03:01:56 -04:00
Amaan Qureshi
39a67eec61 feat: migrate to ESM 2025-09-16 02:24:11 -04:00
Amaan Qureshi
eedbec8f24 feat: remove the need of an external JS runtime for processing grammars 2025-09-16 02:24:11 -04:00
ObserverOfTime
56325d2a3b chore: copy license to all packages 2025-09-11 03:12:35 -04:00