Max Brunsfeld
a2d760e426
Ensure nodes are aliased consistently within syntax error nodes
...
Co-Authored-By: Rick Winfrey <rewinfrey@github.com>
2020-10-27 15:46:09 -07:00
Max Brunsfeld
8bb8e9b8b3
Initialize TSLanguage fields in order of their declaration
...
This makes parser.c valid under the C++20 standard
2020-10-15 07:20:12 -07:00
Max Brunsfeld
ffd3bdc4c1
Escape ? in C string literals
...
Fixes #714
2020-09-23 13:06:06 -07:00
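The reason for escaping `?` is that C compilers which still honor trigraphs rewrite sequences such as `??)` inside string literals (here into `]`). A minimal sketch of such an escaping step, assuming a hypothetical helper named `escape_c_string` (this is not the generator's actual function, and the set of characters it handles is illustrative):

```rust
/// Hypothetical helper: escape `input` so it can be written between double
/// quotes as a C string literal.
fn escape_c_string(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            // Escaping every '?' keeps sequences like "??)" from being
            // interpreted as trigraphs by the C compiler.
            '?' => out.push_str("\\?"),
            '\n' => out.push_str("\\n"),
            '\t' => out.push_str("\\t"),
            '\r' => out.push_str("\\r"),
            // Everything else is copied through unchanged in this sketch.
            _ => out.push(c),
        }
    }
    out
}
```

In C, `\?` is a valid escape sequence that denotes a plain `?`, so the escaped literal keeps its meaning.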
Max Brunsfeld
b5a9adb555
Allow queries to match on supertypes
...
Co-authored-by: Ayman Nadeem <aymannadeem@github.com>
2020-09-21 12:34:48 -07:00
Max Brunsfeld
ff488f89c9
Make the --prev-abi flag work with the newest ABI change
2020-09-08 10:58:20 -07:00
Max Brunsfeld
2eb04094f8
Handle aliased parent nodes in query analysis
2020-08-21 14:12:04 -07:00
Max Brunsfeld
4c2f36a07b
Mark steps as definite on query construction
...
* Add a ts_query_pattern_is_definite API, just for debugging this
* Store state_count on TSLanguage structs, to allow for scanning parse tables
2020-06-25 15:06:27 -07:00
Max Brunsfeld
ec870e9e66
Avoid extracting helpers for char sets that are only used once
2020-05-26 16:37:45 -07:00
Max Brunsfeld
911fb7f1b2
Extract helper functions to reduce the code size of the lexer function (#626)
...
* Extract helper functions to reduce code size of ts_lex
* Name char set helper functions based on token name
2020-05-26 13:39:11 -07:00
Max Brunsfeld
b66d149b74
Fix inconsistent whitespace after '{' in generated parser
2020-05-13 15:56:49 -07:00
Max Brunsfeld
cdc973866f
Fix build-wasm command on latest emscripten
2020-05-12 15:42:11 -07:00
Riccardo Schirone
780e9cecc9
Do not use multiple unnamed structs inside of unions
2020-04-29 20:42:45 +02:00
Max Brunsfeld
a003e5f6bd
generate: Avoid duplicate string tokens in unique symbol map
2020-03-20 11:35:11 -07:00
Alyssa Verkade
0e689657b7
Add a language linkage declaration to parsers
...
Previously, in order to compile a `tree-sitter` grammar that contained
C++ source in the parser (i.e. the `scanner.cc` file), you had to
compile the `parser.c` file separately from the C++ files. For example,
in Rust this would result in a `build.rs` close to the following:
```
extern crate cc;

use std::path::PathBuf;

fn main() {
    let dir: PathBuf = ["tree-sitter-ruby", "src"].iter().collect();

    // Build the C++ scanner as its own static library.
    cc::Build::new()
        .include(&dir)
        .cpp(true)
        .file(dir.join("scanner.cc"))
        // NOTE: must have a name that differs from the C static lib
        .compile("tree-sitter-ruby-scanner");

    // Build the generated C parser as a second static library.
    cc::Build::new()
        .include(&dir)
        .file(dir.join("parser.c"))
        // NOTE: must have a name that differs from the C++ static lib
        .compile("tree-sitter-ruby-parser");
}
```
This was necessary at the time for the following grammars: `ruby`,
`php`, `python`, `embedded-template`, `html`, `cpp`, `ocaml`,
`bash`, `agda`, and `haskell`.
To solve this, we add an `extern "C"` language linkage declaration
to the functions that must be linked against when compiling a parser
together with its scanner, making parsers linkable against C++ source.
On all major compilers (GCC, Clang, and MSVC) this should be the only
change needed, since Clang and GCC have both supported designated
initializers for years and MSVC 2019 adopted them as part of its
C++20 conformance push.
As a result, for Rust projects, the necessary `build.rs` becomes the
following (which also brings these parsers into sync with the current docs):
```
extern crate cc;

use std::path::PathBuf;

fn main() {
    let dir: PathBuf = ["tree-sitter-ruby", "src"].iter().collect();

    // Scanner and parser are now compiled together as C++.
    cc::Build::new()
        .include(&dir)
        .cpp(true)
        .file(dir.join("scanner.cc"))
        .file(dir.join("parser.c"))
        .compile("tree-sitter-ruby");
}
```
2020-02-18 19:46:59 -08:00
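On the generator side, the linkage declaration described above amounts to wrapping the exported declarations in a guard that only C++ compilers see. A rough sketch, assuming a hypothetical helper `render_language_function` and the conventional `tree_sitter_<name>` export (the generator's real output contains more than this, and only a declaration is rendered here):

```rust
/// Hypothetical helper: render the declaration of a grammar's exported
/// language function, wrapped so that a C++ compiler gives it C linkage.
fn render_language_function(language_name: &str) -> String {
    let mut out = String::new();
    // Only C++ compilers define __cplusplus, so C builds skip the wrapper.
    out.push_str("#ifdef __cplusplus\n");
    out.push_str("extern \"C\" {\n");
    out.push_str("#endif\n\n");
    out.push_str(&format!(
        "const TSLanguage *tree_sitter_{}(void);\n\n",
        language_name
    ));
    out.push_str("#ifdef __cplusplus\n");
    out.push_str("}\n");
    out.push_str("#endif\n");
    out
}
```

With C linkage the symbol name is not mangled, so the same `parser.c` links whether it is compiled as C or as C++.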
Max Brunsfeld
8dd68c360a
Fix logic for generating unique symbol map
...
Previously, this didn't correctly handle the case where *multiple*
symbols were all simply-aliased to the same *other* symbol.
Refs #500
2020-01-27 12:06:48 -08:00
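One way to picture the case described above: after aliasing, several distinct symbols end up with identical public metadata, and all of them should resolve to a single representative. The sketch below is an illustration only; the function name, the `(name, is_named)` metadata shape, and the choice of the lowest-numbered symbol as representative are assumptions, not the generator's actual data model.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: `metadata[i]` is the (public name, is_named) pair of
/// symbol `i` after aliases have been applied. Returns, for every symbol, the
/// lowest-numbered symbol that shares the same metadata.
fn unique_symbol_map(metadata: &[(&str, bool)]) -> Vec<usize> {
    let mut first_seen: HashMap<(&str, bool), usize> = HashMap::new();
    metadata
        .iter()
        .enumerate()
        // The first symbol with a given key becomes the representative;
        // later symbols with the same key map back to it.
        .map(|(index, key)| *first_seen.entry(*key).or_insert(index))
        .collect()
}
```

For example, if symbols 0, 2, and 3 all surface as a named `identifier`, the map sends all three to symbol 0, however many symbols share that alias target.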
Max Brunsfeld
fc19312913
Fix node-types bugs involving aliases and external tokens
2019-12-12 10:06:18 -08:00
Max Brunsfeld
a5a9000e29
generate: Ensure that field_map_slices array is long enough
2019-12-09 11:46:32 -08:00
Max Brunsfeld
7032dae4f6
Include alias symbols in unique symbol map
2019-12-06 12:11:09 -08:00
Max Brunsfeld
56c620c005
Store a mapping to ensure no two symbols map to the same metadata
2019-12-05 17:21:46 -08:00
Max Brunsfeld
5767bbc806
Avoid generating C char literals with control characters
...
Fixes #487
2019-11-13 10:54:34 -08:00
Max Brunsfeld
d765332c61
Don't rely on new eof ABI in parsers unless --next-abi is passed
2019-10-31 14:32:50 -07:00
Max Brunsfeld
d3b7caa565
Add a TSLexer.eof() API, use it in generated parsers
2019-10-31 14:11:52 -07:00
Max Brunsfeld
fcaabea0cf
Allow non-terminal extras
2019-10-21 16:08:59 -07:00
Max Brunsfeld
69ab405325
In next ABI, group symbols by action in small parse state table
...
This is a more compact representation because in most states, many
symbols share the same actions.
2019-08-30 20:29:55 -07:00
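The compaction described above can be illustrated by inverting a per-state table of (symbol, action) pairs into one entry per distinct action. This is a sketch under assumed types: `u16` ids and the function name are hypothetical, and the real table layout is not shown in this log.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch: `entries` holds (symbol, action_id) pairs for one
/// parse state. Returns one (action_id, symbols) group per distinct action.
fn group_symbols_by_action(entries: &[(u16, u16)]) -> Vec<(u16, Vec<u16>)> {
    // A BTreeMap keeps the groups in a deterministic order.
    let mut groups: BTreeMap<u16, Vec<u16>> = BTreeMap::new();
    for &(symbol, action) in entries {
        groups.entry(action).or_default().push(symbol);
    }
    for symbols in groups.values_mut() {
        symbols.sort_unstable();
    }
    groups.into_iter().collect()
}
```

When many symbols share an action, each action id is stored once per state instead of once per symbol, which is where the space saving comes from.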
Max Brunsfeld
8037607583
Only generate the new parse table format if --next-abi flag is used
2019-08-29 17:37:33 -07:00
Max Brunsfeld
82ff542d3b
Appease MSVC by avoiding empty arrays
2019-08-29 17:31:44 -07:00
Max Brunsfeld
09a2755399
Store parse states with few lookahead symbols in a more compact way
2019-08-29 15:52:23 -07:00
Max Brunsfeld
48a883c1d4
Move external token state id computation out of render module
2019-08-29 15:48:22 -07:00
Max Brunsfeld
2430733ee8
Avoid iterating hashmaps in places where order matters
2019-08-29 15:26:05 -07:00
Max Brunsfeld
56ce4e5d50
Upgrade rsass, remove hashbrown
2019-08-13 10:08:58 -07:00
Max Brunsfeld
5f369a5870
Fix another empty array literal for MSVC compatibility
2019-08-12 15:13:41 -07:00
Max Brunsfeld
13c0aa7dbb
Avoid empty initializer list for ts_alias_sequences
...
Fixes a bug introduced in 68b089b41e
2019-08-12 14:11:40 -07:00
Max Brunsfeld
68b089b41e
cli: Fix generation of parsers with fields but no aliases
...
Fixes #419
2019-08-11 09:22:30 -07:00
Max Brunsfeld
5b38ff5f78
Loosen lex state equality check to catch some spurious duplicates
2019-06-20 09:57:38 -07:00
Max Brunsfeld
e4873191d6
Refactor generated lex function to use fewer instructions per state
2019-06-20 09:57:38 -07:00
Max Brunsfeld
5035e194ff
Merge branch 'master' into node-fields
2019-03-26 11:58:21 -07:00
Max Brunsfeld
5a59f19b69
Use explicit syntax for functions with no parameters
2019-03-21 16:06:06 -07:00
Max Brunsfeld
56309a1c28
Generate node-fields.json file
2019-02-12 11:06:18 -08:00
Max Brunsfeld
79d90f0d3e
Restore naming of alias sequence lengths
...
Fields aren't stored in sequences now, so the max length
is back to being just for aliases.
2019-02-08 16:14:18 -08:00
Max Brunsfeld
d8a2c0dda2
Use a separate type for storing field map headers
2019-02-08 16:06:29 -08:00
Max Brunsfeld
1d1674811c
Fully implement ts_node_child_by_field_id
2019-02-08 15:16:56 -08:00
Max Brunsfeld
18a13b457d
Get basic field API working
2019-02-08 15:16:56 -08:00
Max Brunsfeld
108ca989ea
Start work on including child refs in generated parsers
2019-02-08 15:16:56 -08:00
Max Brunsfeld
4badd7cc40
Disable compiler optimizations for lex functions in more cases
...
* Reduce the lexer state count threshold from 500 to 300
* Disable optimizations on clang and gcc in addition to MSVC
Optimizations in these source files don't seem to make any impact on
parsing performance, but they slow down compile time substantially.
2019-02-06 11:50:37 -08:00
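A sketch of how a generator might emit the per-compiler pragmas once the lexer grows past the threshold mentioned above. The helper name is hypothetical, and whether the comparison against 300 is strict is an assumption; the pragmas themselves are the standard MSVC, Clang, and GCC spellings for disabling optimization.

```rust
/// Threshold taken from the "500 to 300" note above; treating it as a strict
/// "greater than" comparison is an assumption.
const LEX_STATE_THRESHOLD: usize = 300;

/// Hypothetical helper: return the pragma block to emit before the lex
/// function, or an empty string for small lexers.
fn render_optimization_pragmas(lex_state_count: usize) -> String {
    if lex_state_count <= LEX_STATE_THRESHOLD {
        return String::new();
    }
    [
        "#ifdef _MSC_VER",
        "#pragma optimize(\"\", off)",
        // Clang also defines __GNUC__, so it has to be checked before GCC.
        "#elif defined(__clang__)",
        "#pragma clang optimize off",
        "#elif defined(__GNUC__)",
        "#pragma GCC optimize (\"O0\")",
        "#endif",
        "",
    ]
    .join("\n")
}
```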
Max Brunsfeld
ed195de8b6
rustfmt
2019-01-17 17:16:04 -08:00
Max Brunsfeld
19b2addcc4
Fix bug in symbol enum code generation
2019-01-14 14:08:07 -08:00
Max Brunsfeld
2e009f7177
Avoid writing empty initializer list for alias sequences
2019-01-12 21:57:34 -08:00
Max Brunsfeld
545e840a08
Remove stray single quotes in symbol name strings
2019-01-12 21:42:31 -08:00
Max Brunsfeld
c76a155174
Fix escaping of characters in C strings
2019-01-11 17:43:27 -08:00
Max Brunsfeld
6592fdd24c
Fix parser generation error messages
2019-01-11 17:26:45 -08:00