# Web Tree-sitter
[![npmjs.com badge]][npmjs.com]
[npmjs.com]: https://www.npmjs.org/package/web-tree-sitter
[npmjs.com badge]: https://img.shields.io/npm/v/web-tree-sitter.svg?color=%23BF4A4A
WebAssembly bindings to the [Tree-sitter](https://github.com/tree-sitter/tree-sitter) parsing library.
## Setup
You can download the `web-tree-sitter.js` and `web-tree-sitter.wasm` files from [the latest GitHub release][gh release] and load
them using a standalone script:
```html
<script src="/the/path/to/web-tree-sitter.js"></script>
<script>
  const { Parser } = window.TreeSitter;
  Parser.init().then(() => { /* the library is ready */ });
</script>
```
You can also install [the `web-tree-sitter` module][npm module] from NPM and load it using a system like Webpack:
```js
const { Parser } = require('web-tree-sitter');
Parser.init().then(() => { /* the library is ready */ });
```
or Vite:
```js
import { Parser } from 'web-tree-sitter';
Parser.init().then(() => { /* the library is ready */ });
```
With Vite, you also need to make sure the `tree-sitter.wasm` file is served from your `public` directory. You can copy it there
automatically with a `postinstall` [script](https://docs.npmjs.com/cli/v10/using-npm/scripts) in your `package.json`:
```js
"postinstall": "cp node_modules/web-tree-sitter/tree-sitter.wasm public"
```
You can also use this module with [Deno](https://deno.land/):
```js
import { Parser } from "npm:web-tree-sitter";
await Parser.init();
// the library is ready
```
To use the debug version of the library, replace your import of `web-tree-sitter` with `web-tree-sitter/debug`:
```js
import { Parser } from 'web-tree-sitter/debug'; // or require('web-tree-sitter/debug')
Parser.init().then(() => { /* the library is ready */ });
```
This will load the debug versions of the `.js` and `.wasm` files, which include debug symbols and assertions.
> [!NOTE]
> The `web-tree-sitter.js` file on GH releases is an ES6 module. If you are interested in using a pure CommonJS library, such
> as for Electron, you should use the `web-tree-sitter.cjs` file instead.
### Basic Usage
First, create a parser:
```js
const parser = new Parser();
```
Then assign a language to the parser. Tree-sitter languages are packaged as individual `.wasm` files (more on this below):
```js
const { Language } = require('web-tree-sitter');
const JavaScript = await Language.load('/path/to/tree-sitter-javascript.wasm');
parser.setLanguage(JavaScript);
```
Now you can parse source code:
```js
const sourceCode = 'let x = 1; console.log(x);';
const tree = parser.parse(sourceCode);
```
and inspect the syntax tree:
```javascript
console.log(tree.rootNode.toString());
// (program
//   (lexical_declaration
//     (variable_declarator (identifier) (number)))
//   (expression_statement
//     (call_expression
//       (member_expression (identifier) (property_identifier))
//       (arguments (identifier)))))
const callExpression = tree.rootNode.child(1).firstChild;
console.log(callExpression);
// { type: 'call_expression',
//   startPosition: {row: 0, column: 11},
//   endPosition: {row: 0, column: 25},
//   startIndex: 11,
//   endIndex: 25 }
```
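To go beyond `toString()`, the tree can be walked recursively. The following is a minimal sketch rather than part of the library: `collectNodeTypes` is a hypothetical helper, and it assumes each node exposes `type`, `childCount`, and `child(i)` (which may return `null`).

```javascript
// Hypothetical helper: collect the type of every node in depth-first order.
function collectNodeTypes(node, types = []) {
  types.push(node.type);
  for (let i = 0; i < node.childCount; i++) {
    const child = node.child(i);
    // Skip missing children defensively instead of assuming every slot is filled.
    if (child) collectNodeTypes(child, types);
  }
  return types;
}

// Usage with a parsed tree (assumes `tree` from the example above):
//   const types = collectNodeTypes(tree.rootNode);
```

Passing the accumulator array down the recursion avoids allocating and concatenating a new array at every level.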
### Editing
If your source code *changes*, you can update the syntax tree. This will take less time than the first parse.
```javascript
// Replace 'let' with 'const'
const newSourceCode = 'const x = 1; console.log(x);';
tree.edit({
  startIndex: 0,
  oldEndIndex: 3,
  newEndIndex: 5,
  startPosition: {row: 0, column: 0},
  oldEndPosition: {row: 0, column: 3},
  newEndPosition: {row: 0, column: 5},
});
const newTree = parser.parse(newSourceCode, tree);
```
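Writing the six edit fields by hand is error-prone, so they can be derived from the old text and the replacement. This is only a sketch: `positionAt` and `makeEdit` are hypothetical helpers, not library functions, and they assume columns are counted in the same units as JavaScript string indices, as in the single-line example above.

```javascript
// Hypothetical helper: convert a string offset into a {row, column} position.
function positionAt(text, index) {
  const before = text.slice(0, index);
  const lastNewline = before.lastIndexOf('\n');
  const row = before.split('\n').length - 1;
  return { row, column: index - (lastNewline + 1) };
}

// Hypothetical helper: describe replacing oldText[startIndex, oldEndIndex)
// with `replacement`, returning the new text and the matching edit object.
function makeEdit(oldText, startIndex, oldEndIndex, replacement) {
  const newText =
    oldText.slice(0, startIndex) + replacement + oldText.slice(oldEndIndex);
  const newEndIndex = startIndex + replacement.length;
  return {
    newText,
    edit: {
      startIndex,
      oldEndIndex,
      newEndIndex,
      startPosition: positionAt(oldText, startIndex),
      oldEndPosition: positionAt(oldText, oldEndIndex),
      newEndPosition: positionAt(newText, newEndIndex),
    },
  };
}

// Usage (assumes `tree`, `parser`, and `sourceCode` from the examples above):
//   const { newText, edit } = makeEdit(sourceCode, 0, 3, 'const');
//   tree.edit(edit);
//   const newTree = parser.parse(newText, tree);
```

For the `let` to `const` replacement, this produces the same values as the hand-written edit above.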
### Parsing Text From a Custom Data Structure
If your text is stored in a data structure other than a single string, you can parse it by supplying a callback to `parse`
instead of a string:
```javascript
const sourceLines = [
  'let x = 1;',
  'console.log(x);'
];
const tree = parser.parse((index, position) => {
  let line = sourceLines[position.row];
  if (line) return line.slice(position.column);
});
```
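The callback can also ignore `position` and work from `index` alone, which suits text stored as a list of chunks. The sketch below assumes `index` is an offset into the concatenated text, measured like a JavaScript string index; `makeChunkReader` is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: build a parse callback over an array of string chunks.
function makeChunkReader(chunks) {
  // Precompute the offset at which each chunk starts in the full text.
  const starts = [];
  let total = 0;
  for (const chunk of chunks) {
    starts.push(total);
    total += chunk.length;
  }
  return (index) => {
    if (index >= total) return undefined; // past the end of the input
    // Find the last chunk that starts at or before `index`.
    let i = starts.length - 1;
    while (starts[i] > index) i--;
    return chunks[i].slice(index - starts[i]);
  };
}

// Usage (assumes `parser` from the examples above):
//   const tree = parser.parse(makeChunkReader(['let x = 1;\n', 'console.log(x);']));
```

A linear scan keeps the sketch short; for many chunks, a binary search over `starts` would be the natural replacement.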
### Getting the `.wasm` language files
There are several ways to get the `.wasm` files for the languages you want to parse.
#### From npmjs.com
The recommended way is to just install the package from npm. For example, to parse JavaScript, you can install the `tree-sitter-javascript`
package:
```sh
npm install tree-sitter-javascript
```
Then you can find the `.wasm` file in the `node_modules/tree-sitter-javascript` directory.
#### From GitHub
You can also download the `.wasm` files from GitHub releases, so long as the repository uses our reusable workflow to publish
them.
For example, you can download the JavaScript `.wasm` file from the tree-sitter-javascript [releases page][gh release js].
#### Generating `.wasm` files
You can also generate the `.wasm` file for your desired grammar. Shown below is an example of how to generate the `.wasm`
file for the JavaScript grammar.

**IMPORTANT**: One of [Emscripten][emscripten], [Docker][docker], or [Podman][podman] needs to be installed.

First install `tree-sitter-cli` and the tree-sitter language for which to generate the `.wasm` file
(`tree-sitter-javascript` in this example):
```sh
npm install --save-dev tree-sitter-cli tree-sitter-javascript
```
Then use the `tree-sitter` CLI tool to generate the `.wasm` file:
```sh
npx tree-sitter build --wasm node_modules/tree-sitter-javascript
```
If everything works, a `tree-sitter-javascript.wasm` file should be generated in the current directory.
### WASM Version Compatibility
> [!IMPORTANT]
> WASM language files must be generated with a compatible version of `tree-sitter-cli`.

The WASM binary format includes ABI (Application Binary Interface) information that must match between `web-tree-sitter` and the generated `.wasm` files. Using incompatible versions will cause `Language.load()` to fail.

| web-tree-sitter | Compatible tree-sitter-cli |
|-----------------|----------------------------|
| 0.26.x          | 0.26.x                     |
| 0.25.x          | 0.20.x - 0.25.x            |
| 0.24.x          | 0.20.x - 0.24.x            |

**If you're using pre-built WASM files** from third-party packages (e.g., `tree-sitter-wasms`), ensure they were built with a compatible `tree-sitter-cli` version.

**Recommended**: Generate WASM files with the same major.minor version of `tree-sitter-cli` as your `web-tree-sitter` version:
```sh
# For web-tree-sitter@0.26.x, use tree-sitter-cli@0.26.x
npm install tree-sitter-cli@0.26
npx tree-sitter build --wasm node_modules/tree-sitter-javascript
```
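The compatibility table can also be encoded as a small check, for example in a build script. This is only a sketch: `COMPATIBLE_CLI_MINORS`, `minorOf`, and `isCliCompatible` are hypothetical names, the ranges are copied from the table above, and any version the table does not list is treated as not known to be compatible.

```javascript
// Ranges copied from the table above, keyed by web-tree-sitter minor version (0.x).
// Each value is [lowest, highest] compatible tree-sitter-cli minor version.
const COMPATIBLE_CLI_MINORS = {
  26: [26, 26],
  25: [20, 25],
  24: [20, 24],
};

// Extract the minor number from a "0.x" or "0.x.y" version string, or null.
function minorOf(version) {
  const match = /^0\.(\d+)(?:\.|$)/.exec(version);
  return match ? Number(match[1]) : null;
}

// Returns true only when the table lists the pair as compatible.
function isCliCompatible(webTreeSitterVersion, cliVersion) {
  const range = COMPATIBLE_CLI_MINORS[minorOf(webTreeSitterVersion)];
  const cliMinor = minorOf(cliVersion);
  if (!range || cliMinor === null) return false; // not covered by the table
  return cliMinor >= range[0] && cliMinor <= range[1];
}

// Usage:
//   isCliCompatible('0.25.3', '0.22.6'); // inside the 0.20.x - 0.25.x range
```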
### Running .wasm in Node.js
Note that executing `.wasm` files in Node.js is considerably slower than running [Node.js bindings][node bindings].
However, this could be useful for testing purposes:
```javascript
const { Parser, Language } = require('web-tree-sitter');
(async () => {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  const tree = parser.parse('let x = 1;');
  console.log(tree.rootNode.toString());
})();
```
### Running .wasm in browser
`web-tree-sitter` can run in the browser, but there are some common pitfalls.
#### Loading the .wasm file
`web-tree-sitter` needs to load the `tree-sitter.wasm` file. By default, it assumes that this file is available in the
same path as the JavaScript code. Therefore, if the code is being served from `http://localhost:3000/bundle.js`, then
the Wasm file should be at `http://localhost:3000/tree-sitter.wasm`.

For server-side frameworks like NextJS, this can be tricky as pages are often served from a path such as
`http://localhost:3000/_next/static/chunks/pages/index.js`. The loader will therefore look for the Wasm file at
`http://localhost:3000/_next/static/chunks/pages/tree-sitter.wasm`. The solution is to pass a `locateFile` function in
the `moduleOptions` argument to `Parser.init()`:
```javascript
await Parser.init({
  // scriptName is the Wasm file name; scriptDirectory is where the loader expects it.
  locateFile(scriptName, scriptDirectory) {
    return scriptName;
  },
});
```
`locateFile` takes in two parameters, `scriptName`, i.e. the Wasm file name, and `scriptDirectory`, i.e. the directory
where the loader expects the script to be. It returns the path where the loader will look for the Wasm file. In the NextJS
case, we want to return just the `scriptName` so that the loader will look at `http://localhost:3000/tree-sitter.wasm`
and not `http://localhost:3000/_next/static/chunks/pages/tree-sitter.wasm`.
For more information on the module options you can pass in, see the [emscripten documentation][emscripten-module-options].
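If the Wasm file is hosted somewhere other than the site root, the same `locateFile` idea generalizes to any base path. The following is a sketch under assumptions: `makeLocateFile` is a hypothetical helper, and `/static/wasm` is a placeholder path, not something the library defines.

```javascript
// Hypothetical helper: build a `locateFile` that resolves every Wasm file
// against a single base URL or path.
function makeLocateFile(baseUrl) {
  const base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/';
  // `scriptDirectory` is ignored on purpose: the default location is what
  // this function is meant to override.
  return (scriptName, scriptDirectory) => base + scriptName;
}

// Usage (the path is a placeholder for wherever the file is actually served):
//   await Parser.init({ locateFile: makeLocateFile('/static/wasm') });
```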
#### "Can't resolve 'fs' in 'node_modules/web-tree-sitter'"
Most bundlers will notice that the `web-tree-sitter.js` file is attempting to import `fs`, i.e. Node's file system library.
Since this doesn't exist in the browser, the bundlers will get confused. For Webpack, you can fix this by adding the
following to your webpack config:
```javascript
{
  resolve: {
    fallback: {
      fs: false
    }
  }
}
```
[docker]: https://www.docker.com
[emscripten]: https://emscripten.org
[emscripten-module-options]: https://emscripten.org/docs/api_reference/module.html#affecting-execution
[gh release]: https://github.com/tree-sitter/tree-sitter/releases/latest
[gh release js]: https://github.com/tree-sitter/tree-sitter-javascript/releases/latest
[node bindings]: https://github.com/tree-sitter/node-tree-sitter
[npm module]: https://www.npmjs.com/package/web-tree-sitter
[podman]: https://podman.io