5 commits

Author SHA1 Message Date
Matthew Krupcale
ee9a3c0ebb lib: remove utf8proc dependency (#436)
* Remove dependency on utf8proc

This removes the only external dependency, utf8proc, which was used for UTF-8 decoding. The library now implements its own UTF-8 decoder, which is both faster and has a simpler API.

 * .gitmodules: remove utf8proc submodule
 * docs/section-2-using-parsers.md: remove requirement for utf8proc submodule
 * docs/section-6-contributing.md: likewise
 * lib/Cargo.toml: remove utf8proc subdirectory package include
 * lib/README.md: remove utf8proc subdirectory description
 * lib/binding_rust/build.rs: remove utf8proc compiler include directory
 * lib/src/lexer.c: remove utf8proc dependencies and types
 * lib/src/lib.c: remove utf8proc dependency
 * lib/src/unicode.h: define types for Unicode decoders
 * lib/src/utf16.{c,h}: implement more readable UTF-16 decoder
 * lib/src/utf8.{c,h}: implement fast UTF-8 decoder
 * lib/utf8proc: remove utf8proc submodule directory
 * script/build-lib: remove utf8proc compiler include directory
 * script/build-wasm: likewise

* Optimize ts_lexer__get_lookahead.

Try to favor the non-failure code path and assign lookahead values directly to the lexer.

 * lib/src/lexer.c: optimize for non-failure code path

* Fix some compiler errors

 * lib/src/lexer.c: cast from signed to unsigned for decode_next result
 * lib/src/utf16.c: fix non-constant initializers for older compilers

* Remove some missed remnants of utf8proc

 * docs/section-2-using-parsers.md: only two include paths necessary now
 * lib/src/lib.c: no need to define UTF8PROC_STATIC

* Use ICU's UTF-8 and UTF-16 decoding routines

* Remove unnecessary casts when calling ICU macros

* Check buffer length before attempting to decode a Unicode character

* Use new Unicode function when parsing queries

Co-Authored-By: Matthew Krupcale <mkrupcale@matthewkrupcale.com>

* Mark libicu files as vendored for GitHub's stats
2019-10-14 11:18:39 -07:00
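
As an illustration of the "check buffer length before attempting to decode" item in the commit above, here is a minimal sketch of a bounds-checked UTF-8 decoder. It is not the code from this commit (which, per the message, ends up using ICU's C decoding routines); the function name `decode_utf8` and its return shape are assumptions made for the example.

```rust
/// Decode one code point from the start of `buf`.
/// Returns `Some((code_point, bytes_consumed))`, or `None` if the buffer is
/// empty, truncated, or does not start with well-formed UTF-8.
/// Hypothetical helper for illustration only.
pub fn decode_utf8(buf: &[u8]) -> Option<(u32, usize)> {
    // An empty buffer yields `None` instead of an out-of-bounds read.
    let first = *buf.first()?;
    // Sequence length, payload bits of the lead byte, and the smallest
    // code point that legitimately needs this many bytes.
    let (len, init, min): (usize, u32, u32) = match first {
        0x00..=0x7F => return Some((first as u32, 1)),
        0xC2..=0xDF => (2, (first & 0x1F) as u32, 0x80),
        0xE0..=0xEF => (3, (first & 0x0F) as u32, 0x800),
        0xF0..=0xF4 => (4, (first & 0x07) as u32, 0x1_0000),
        _ => return None, // stray continuation byte or invalid lead byte
    };
    // Check that the whole sequence is present before decoding it.
    if buf.len() < len {
        return None;
    }
    let mut cp = init;
    for &b in &buf[1..len] {
        // Every trailing byte must look like 10xxxxxx.
        if b & 0xC0 != 0x80 {
            return None;
        }
        cp = (cp << 6) | (b & 0x3F) as u32;
    }
    // Reject overlong encodings, surrogates, and values past U+10FFFF.
    if cp < min || (0xD800..=0xDFFF).contains(&cp) || cp > 0x10_FFFF {
        return None;
    }
    Some((cp, len))
}
```

A lexer built on a helper like this can treat `None` as the failure path and otherwise advance by the returned byte count.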
Max Brunsfeld
9a82bd9d83 Set up code to publish web bindings to npm 2019-05-07 13:11:04 -07:00
Max Brunsfeld
8edb6927d0 Update docs after Rust conversion 2019-02-05 11:34:01 -08:00
Max Brunsfeld
50281637d7 binding: Make parse methods more convenient
* Rename parse_str to parse and make it polymorphic.
* Rename parse_utf8 to parse_with, since it is now the callback-based
  version of parse
* Add a parse_utf16 method analogous to parse
* Rename existing parse_utf16 method to parse_utf16_with

This brings in the changes from tree-sitter/rust-tree-sitter#5
2019-02-05 10:59:33 -08:00
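
The naming scheme described in the commit above (a polymorphic `parse` plus a callback-based `parse_with`) can be sketched roughly as follows. `ToyParser` is a hypothetical stand-in that only counts the bytes it is given; the real binding's signatures and return types are not shown in this log and are not reproduced here.

```rust
/// Hypothetical stand-in for a parser; it only counts input bytes.
pub struct ToyParser;

impl ToyParser {
    /// Convenience entry point: accepts anything viewable as bytes
    /// (`&str`, `String`, `&[u8]`, `Vec<u8>`, ...).
    pub fn parse(&mut self, text: impl AsRef<[u8]>) -> usize {
        let bytes = text.as_ref();
        // Delegate to the callback-based version, handing out the
        // remainder of the buffer starting at the requested offset.
        self.parse_with(&mut |offset: usize| &bytes[offset.min(bytes.len())..])
    }

    /// Callback-based entry point: `read(offset)` returns the next chunk
    /// of input starting at `offset`; an empty chunk signals end of input.
    pub fn parse_with<'a, F>(&mut self, read: &mut F) -> usize
    where
        F: FnMut(usize) -> &'a [u8],
    {
        let mut offset = 0;
        loop {
            let chunk = read(offset);
            if chunk.is_empty() {
                return offset;
            }
            offset += chunk.len();
        }
    }
}
```

Under this pattern the `_with` suffix marks the variant that pulls input through a callback, which suits text that is not stored in one contiguous buffer; a `parse_utf16` / `parse_utf16_with` pair would follow the same shape over `u16` units.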
Max Brunsfeld
97ca3bc2d1 Add rust tree-sitter runtime binding in lib directory 2019-01-04 17:16:34 -08:00