From f6a943a1ad4483c87fc205c19648c559bc731e26 Mon Sep 17 00:00:00 2001 From: Amaan Qureshi Date: Mon, 20 Jan 2025 04:06:32 -0500 Subject: [PATCH] docs(web): update README and add CONTRIBUTING docs --- lib/binding_web/CONTRIBUTING.md | 142 ++++++++++++++++++++++++++++++++ lib/binding_web/README.md | 56 +++++++++---- 2 files changed, 182 insertions(+), 16 deletions(-) create mode 100644 lib/binding_web/CONTRIBUTING.md diff --git a/lib/binding_web/CONTRIBUTING.md b/lib/binding_web/CONTRIBUTING.md new file mode 100644 index 00000000..7072abe0 --- /dev/null +++ b/lib/binding_web/CONTRIBUTING.md @@ -0,0 +1,142 @@ +# Contributing + +## Code of Conduct + +Contributors to Tree-sitter should abide by the [Contributor Covenant][covenant]. + +## Developing Web-tree-sitter + +### Prerequisites + +To make changes to Web-tree-sitter, you should have: + +1. A [Rust toolchain][rust], for running the xtasks necessary to build the library. +2. Node.js and NPM (or an equivalent package manager). +3. Either [Emscripten][emscripten], [Docker][docker], or [podman][podman] for +compiling the library to WASM. + +### Building + +Clone the repository: + +```sh +git clone https://github.com/tree-sitter/tree-sitter +cd tree-sitter/lib/binding_web +``` + +Install the necessary dependencies: + +```sh +npm install +``` + +Build the library: + +```sh +npm run build +``` + +Note that the build process requires a Rust toolchain to be installed. If you don't have one installed, you can install it +by visiting the [Rust website][rust] and following the instructions there. + +> [!NOTE] +> By default, the build process will emit an ES6 module. If you need a CommonJS module, export `CJS` to `true`, or just +> run `CJS=true npm run build`. + +> [!TIP] +> To build the library with debug information, you can run `npm run build:debug`. The `CJS` environment variable is still +> taken into account. 
+ +### Putting it together + +#### The C side + +There are several components that come together to build the final JS and WASM files. First, we use `emscripten` in our +xtask located at `xtask/src/build_wasm.rs` from the root directory to compile the WASM files. This WASM module is output into the +local `lib` folder, and is used only in [`src/bindings.ts`][bindings.ts] to handle loading the WASM module. The C code that +is compiled into the WASM module is located at [`lib/tree-sitter.c`][tree-sitter.c], and contains all the necessary +glue code to interact with the JS environment. If you need to update the imported functions from the tree-sitter library, +or anywhere else, you must update [`lib/exports.txt`][exports.txt]. Lastly, the type information for the WASM module is +located at [`lib/tree-sitter.d.ts`][tree-sitter.d.ts], and can be updated by running `cargo xtask build-wasm --emit-tsd` +from the root directory. + +#### The TypeScript side + +The TypeScript library is a higher-level abstraction over the WASM module, and is located in `src`. This is where the +public API is defined, and where the WASM module is loaded and initialized. The TypeScript library is built into a single +ES6 (or CommonJS) module, and is output into the same directory as `package.json`. If you need to update the public API, +you can do so by editing the files in `src`. + +If you make changes to the library that require updating the type definitions, such as adding a new public API method, +you should run: + +```sh +npm run build:dts +``` + +This uses [`dts-buddy`][dts-buddy] to generate `web-tree-sitter.d.ts` from the public types in `src`. Additionally, a sourcemap +is generated for the `.d.ts` file, which enables `go-to definition` and other editor integrations to take you straight +to the TypeScript source code. + +This TypeScript code is then compiled into a single JavaScript file with `esbuild`.
The build configuration for this can +be found in [`script/build.js`][build.js], but this shouldn't need to be updated. This step is responsible for emitting +the final JS and WASM files that are shipped with the library, as well as their sourcemaps. + +### Testing + +Before you can run the tests, you need to fetch and build some upstream grammars that are used for testing. +Run this in the root of the repository: + +```sh +cargo xtask fetch-fixtures +``` + +Optionally, to update the generated `parser.c` files: + +```sh +cargo xtask generate-fixtures +``` + +Then you can build the WASM modules: + +```sh +cargo xtask generate-fixtures --wasm +``` + +Now, you can run the tests. In the `lib/binding_web` directory, run: + +```sh +npm test +``` + +> [!NOTE] +> We use `vitest` to run the tests. If you want to run a specific test, you can use the `-t` flag to pass in a pattern. +> If you want to run a specific file, you can just pass the name of the file as is. For example, to run the `parser` tests +> in `test/parser.test.ts`, you can run `npm test parser`. To run tests whose names contain `descendant`, run +> `npm test -- -t descendant`. +> +> For coverage information, you can run `npm test -- --coverage`. + +### Debugging + +You might have noticed that when you ran `npm run build`, the build process generated a couple of [sourcemaps][sourcemap]: +`tree-sitter.js.map` and `tree-sitter.wasm.map`. These sourcemaps can be used to debug the library in the browser, and are +shipped with the library on both NPM and the GitHub releases. + +#### Tweaking the Emscripten build + +If you're trying to tweak the Emscripten build, or are trying to debug an issue, the code for this lies in the `xtask/src/build_wasm.rs` +file mentioned earlier, namely in the `run_wasm` function.
+ +[bindings.ts]: src/bindings.ts +[build.js]: script/build.js +[covenant]: https://www.contributor-covenant.org/version/1/4/code-of-conduct +[docker]: https://www.docker.com +[dts-buddy]: https://github.com/Rich-Harris/dts-buddy +[emscripten]: https://emscripten.org +[exports.txt]: lib/exports.txt +[podman]: https://podman.io +[rust]: https://www.rust-lang.org/tools/install +[sourcemap]: https://developer.mozilla.org/en-US/docs/Glossary/Source_map +[tree-sitter.c]: lib/tree-sitter.c +[tree-sitter.d.ts]: lib/tree-sitter.d.ts diff --git a/lib/binding_web/README.md b/lib/binding_web/README.md index 73a7c631..3713df5e 100644 --- a/lib/binding_web/README.md +++ b/lib/binding_web/README.md @@ -7,9 +7,10 @@ WebAssembly bindings to the [Tree-sitter](https://github.com/tree-sitter/tree-sitter) parsing library. -### Setup +## Setup -You can download the `tree-sitter.js` and `tree-sitter.wasm` files from [the latest GitHub release](https://github.com/tree-sitter/tree-sitter/releases/latest) and load them using a standalone script: +You can download the `tree-sitter.js` and `tree-sitter.wasm` files from [the latest GitHub release][gh release] and load +them using a standalone script: ```html @@ -20,7 +21,7 @@ You can download the `tree-sitter.js` and `tree-sitter.wasm` files from [the lat ``` -You can also install [the `web-tree-sitter` module](https://www.npmjs.com/package/web-tree-sitter) from NPM and load it using a system like Webpack: +You can also install [the `web-tree-sitter` module][npm module] from NPM and load it using a system like Webpack: ```js const { Parser } = require('web-tree-sitter'); @@ -50,13 +51,22 @@ await Parser.init(); // the library is ready ``` -To install a debug version of the library, pass in `--debug` when running `npm install`: +To use the debug version of the library, replace your import of `web-tree-sitter` with `web-tree-sitter/debug`: -```sh -npm install web-tree-sitter --debug +```js +import { Parser } from 'web-tree-sitter/debug'; // or 
require('web-tree-sitter/debug') + +Parser.init().then(() => { /* the library is ready */ }); ``` -This will load the debug version of the `.wasm` file, which includes sourcemaps for both the JS and WASM files, debug symbols, and assertions. +This will load the debug versions of the `.js` and `.wasm` files, which include debug symbols and assertions. + +> [!NOTE] +> The `tree-sitter.js` file on GH releases is an ES6 module. If you are interested in using a pure CommonJS library, such +> as for Electron, you should note that in our NPM package, we use [conditional exports][cond export] to provide both the +> ES6 and CommonJS modules. If you've set up your project correctly, and need to use CommonJS, your package manager will +> automatically handle this for you. At the time of writing, we do not host a CommonJS version of the library on GH releases, and +> if you do not use the NPM registry, you'll have to build the library yourself. ### Basic Usage @@ -126,7 +136,8 @@ const newTree = parser.parse(newSourceCode, tree); ### Parsing Text From a Custom Data Structure -If your text is stored in a data structure other than a single string, you can parse it by supplying a callback to `parse` instead of a string: +If your text is stored in a data structure other than a single string, you can parse it by supplying a callback to `parse` +instead of a string: ```javascript const sourceLines = [ @@ -146,7 +157,8 @@ There are several options on how to get the `.wasm` files for the languages you #### From npmjs.com -The recommended way is to just install the package from npm.
For example, to parse JavaScript, you can install the `tree-sitter-javascript` +package: ```sh npm install tree-sitter-javascript @@ -156,16 +168,19 @@ Then you can find the `.wasm` file in the `node_modules/tree-sitter-javascript` #### From GitHub -You can also download the `.wasm` files from GitHub releases, so long as the repository uses our reusable workflow to publish them. -For example, you can download the JavaScript `.wasm` file from the tree-sitter-javascript [releases page](https://github.com/tree-sitter/tree-sitter-javascript/releases/latest) +You can also download the `.wasm` files from GitHub releases, so long as the repository uses our reusable workflow to publish +them. +For example, you can download the JavaScript `.wasm` file from the tree-sitter-javascript [releases page][gh release js]. #### Generating `.wasm` files -You can also generate the `.wasm` file for your desired grammar. Shown below is an example of how to generate the `.wasm` file for the JavaScript grammar. +You can also generate the `.wasm` file for your desired grammar. Shown below is an example of how to generate the `.wasm` +file for the JavaScript grammar. -**IMPORTANT**: [emscripten](https://emscripten.org/docs/getting_started/downloads.html), [docker](https://www.docker.com/), or [podman](https://podman.io) need to be installed. +**IMPORTANT**: [Emscripten][emscripten], [Docker][docker], or [Podman][podman] need to be installed. 
-First install `tree-sitter-cli` and the tree-sitter language for which to generate `.wasm` (`tree-sitter-javascript` in this example): +First install `tree-sitter-cli` and the tree-sitter language for which to generate `.wasm` +(`tree-sitter-javascript` in this example): ```sh npm install --save-dev tree-sitter-cli tree-sitter-javascript @@ -181,7 +196,8 @@ If everything is fine, file `tree-sitter-javascript.wasm` should be generated in ### Running .wasm in Node.js -Notice that executing `.wasm` files in node.js is considerably slower than running [node.js bindings](https://github.com/tree-sitter/node-tree-sitter). However could be useful for testing purposes: +Note that executing `.wasm` files in Node.js is considerably slower than running the [Node.js bindings][node bindings]. +However, this could be useful for testing purposes: ```javascript const Parser = require('web-tree-sitter'); @@ -229,7 +245,7 @@ For more information on the module options you can pass in, see the [emscripten #### "Can't resolve 'fs' in 'node_modules/web-tree-sitter" Most bundlers will notice that the `tree-sitter.js` file is attempting to import `fs`, i.e. node's file system library. -Since this doesn't exist in the browser, the bundlers will get confused.
For Webpack, you can fix this by adding the following to your webpack config: ```javascript @@ -242,4 +258,12 @@ following to your webpack config: } ``` +[cond export]: https://nodejs.org/api/packages.html#conditional-exports +[docker]: https://www.docker.com +[emscripten]: https://emscripten.org [emscripten-module-options]: https://emscripten.org/docs/api_reference/module.html#affecting-execution +[gh release]: https://github.com/tree-sitter/tree-sitter/releases/latest +[gh release js]: https://github.com/tree-sitter/tree-sitter-javascript/releases/latest +[node bindings]: https://github.com/tree-sitter/node-tree-sitter +[npm module]: https://www.npmjs.com/package/web-tree-sitter +[podman]: https://podman.io