An incremental parsing system for programming tools https://tree-sitter.github.io
Matthew Krupcale ee9a3c0ebb lib: remove utf8proc dependency (#436)
* Remove dependency on utf8proc

This removes utf8proc, the only external dependency, which was used for UTF-8 decoding, and replaces it with a custom UTF-8 decoder. The new decoder is faster and has a simpler API.

 * .gitmodules: remove utf8proc submodule
 * docs/section-2-using-parsers.md: remove requirement for utf8proc submodule
 * docs/section-6-contributing.md: likewise
 * lib/Cargo.toml: remove utf8proc subdirectory package include
 * lib/README.md: remove utf8proc subdirectory description
 * lib/binding_rust/build.rs: remove utf8proc compiler include directory
 * lib/src/lexer.c: remove utf8proc dependencies and types
 * lib/src/lib.c: remove utf8proc dependency
 * lib/src/unicode.h: define types for Unicode decoders
 * lib/src/utf16.{c,h}: implement more readable UTF-16 decoder
 * lib/src/utf8.{c,h}: implement fast UTF-8 decoder
 * lib/utf8proc: remove utf8proc submodule directory
 * script/build-lib: remove utf8proc compiler include directory
 * script/build-wasm: likewise

* Optimize ts_lexer__get_lookahead.

Favor the non-failure code path and assign lookahead values directly to the lexer.

 * lib/src/lexer.c: optimize for non-failure code path

* Fix some compiler errors

 * lib/src/lexer.c: cast from signed to unsigned for decode_next result
 * lib/src/utf16.c: fix non-constant initializers for older compilers

* Remove some missed remnants of utf8proc

 * docs/section-2-using-parsers.md: only two include paths necessary now
 * lib/src/lib.c: no need to define UTF8PROC_STATIC

* Use ICU's utf8 and utf16 decoding routines

* Remove unnecessary casts when calling icu macros

* Check buffer length before attempting to decode a Unicode character

* Use the new Unicode function when parsing queries

Co-Authored-By: Matthew Krupcale <mkrupcale@matthewkrupcale.com>

* Mark libicu files as vendored for GitHub's stats
2019-10-14 11:18:39 -07:00
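
The commit above swaps utf8proc for in-tree decoding and adds a buffer-length check before a character is decoded. As a rough illustration of what a bounds-checked UTF-8 decode step can look like, here is a minimal sketch in C. It is not the code from this commit (which ends up using ICU's decoding macros); `decode_utf8` and its signature are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper for illustration only; not part of tree-sitter's API.
// Decodes one code point from buf[0..len). Returns the number of bytes
// consumed, or 0 if the input is empty, truncated, or malformed.
size_t decode_utf8(const uint8_t *buf, size_t len, int32_t *code_point) {
  if (len == 0) return 0;

  uint8_t lead = buf[0];
  size_t size;    // total bytes in the sequence
  int32_t value;  // payload bits accumulated so far
  int32_t min;    // smallest code point allowed for this length

  if (lead < 0x80) {
    *code_point = lead;  // ASCII fast path
    return 1;
  } else if ((lead & 0xE0) == 0xC0) {
    size = 2; value = lead & 0x1F; min = 0x80;
  } else if ((lead & 0xF0) == 0xE0) {
    size = 3; value = lead & 0x0F; min = 0x800;
  } else if ((lead & 0xF8) == 0xF0) {
    size = 4; value = lead & 0x07; min = 0x10000;
  } else {
    return 0;  // stray continuation byte or invalid lead byte
  }

  // Check the buffer length before touching any continuation byte.
  if (len < size) return 0;

  for (size_t i = 1; i < size; i++) {
    if ((buf[i] & 0xC0) != 0x80) return 0;  // not a continuation byte
    value = (value << 6) | (buf[i] & 0x3F);
  }

  // Reject overlong encodings, surrogates, and out-of-range values.
  if (value < min || value > 0x10FFFF) return 0;
  if (value >= 0xD800 && value <= 0xDFFF) return 0;

  *code_point = value;
  return size;
}
```

A caller would advance its read position by the returned byte count and treat a return value of 0 as the signal to refill the buffer or report an invalid character. The ordering is what matters here: the length test runs before any continuation byte is read, so a sequence cut off at the end of a chunk never causes an out-of-bounds access.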
Name | Last commit | Date
cli | node-types: Rework the approach to computing multiple and required | 2019-10-11 13:59:02 -07:00
docs | lib: remove utf8proc dependency (#436) | 2019-10-14 11:18:39 -07:00
highlight | highlight iterator: Return byte offset ranges instead of string slices | 2019-09-04 17:29:31 -07:00
lib | lib: remove utf8proc dependency (#436) | 2019-10-14 11:18:39 -07:00
script | lib: remove utf8proc dependency (#436) | 2019-10-14 11:18:39 -07:00
test | Handle named nodes aliased as anonymous nodes | 2019-08-29 14:28:44 -07:00
.appveyor.yml | Build and test wasm on CI | 2019-04-26 14:38:13 -07:00
.gitattributes | lib: remove utf8proc dependency (#436) | 2019-10-14 11:18:39 -07:00
.gitignore | Build and test wasm on CI | 2019-04-26 14:38:13 -07:00
.gitmodules | lib: remove utf8proc dependency (#436) | 2019-10-14 11:18:39 -07:00
.travis.yml | Don't include wasm library in the CLI binary on windows | 2019-05-14 15:51:12 -07:00
Cargo.lock | 0.15.10 | 2019-10-02 14:13:20 -07:00
Cargo.toml | Move code into cli directory | 2019-01-04 16:50:52 -08:00
LICENSE | Add boilerplate | 2018-05-17 14:46:29 -07:00
README.md | Use https README docs site link | 2019-04-30 13:00:27 -07:00

tree-sitter


Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited. Tree-sitter aims to be:

  • General enough to parse any programming language
  • Fast enough to parse on every keystroke in a text editor
  • Robust enough to provide useful results even in the presence of syntax errors
  • Dependency-free so that the runtime library (which is written in pure C) can be embedded in any application

Documentation