Previously, in order to compile a `tree-sitter` grammar that contained
c++ source in the parser (ie the `scanner.cc` file), you would have to
compile the `parser.c` file separately from the c++ files. For example,
in rust this would result in a `build.rs` close to the following:
```
extern crate cc;
fn main() {
let dir: PathBuf = ["tree-sitter-ruby", "src"].iter().collect();
cc::Build::new()
.include(&dir)
.cpp(true)
.file(dir.join("scanner.cc"))
// NOTE: must have a name that differs from the c static lib
.compile("tree-sitter-ruby-scanner");
cc::Build::new()
.include(&dir)
.file(dir.join("parser.c"))
// NOTE: must have a name that differs from the c++ static lib
.compile("tree-sitter-ruby-parser");
}
```
This was necessary at the time for the following grammars: `ruby`,
`php`, `python`, `embedded-template`, `html`, `cpp`, `ocaml`,
`bash`, `agda`, and `haskell`.
To solve this, we specify an `extern "C"` language linkage declaration
to the functions that must be linked against to compile a parser with the
scanner, making parsers linkable against c++ source.
On all major compilers (gcc, clang, and msvc) this should be the only
change needed due to the combination of clang and gcc both supporting
designated initialization for years and msvc 2019 adopting designated
initializers as a part of the C++20 conformance push.
Subsequently, for rust projects, the necessary `build.rs` would become
(which also brings these parsers into sync with the current docs):
```
extern crate cc;
fn main() {
let dir: PathBuf = ["tree-sitter-ruby", "src"].iter().collect();
cc::Build::new()
.include(&dir)
.cpp(true)
.file(dir.join("scanner.cc"))
.file(dir.join("parser.c"))
.compile("tree-sitter-ruby");
}
```
* Reduce the lexer state count threshold from 500 to 300
* Disable optimizations on clang and gcc in addition to MSVC
Optimizations in these source files don't seem to make any impact on
parsing performance, but they slow down compile time substantially.