---
title: Using Parsers
permalink: using-parsers
---

# Using Parsers

All of Tree-sitter's parsing functionality is exposed through C APIs. Applications written in higher-level languages can use Tree-sitter via binding libraries like [node-tree-sitter](https://github.com/tree-sitter/node-tree-sitter) or [rust-tree-sitter](https://github.com/tree-sitter/tree-sitter/tree/master/lib/binding), which have their own documentation.

This document describes the general concepts of how to use Tree-sitter, which should be relevant regardless of what language you're using. It also goes into some C-specific details that are useful if you're using the C API directly or are building a new binding to a different language.

## Building the Library

Building the library requires one git submodule: [`utf8proc`](https://github.com/JuliaStrings/utf8proc). Make sure that `utf8proc` is downloaded by running this command from the Tree-sitter directory:

```sh
git submodule update --init
```

To build the library on a POSIX system, run this script, which will create a static library called `libtree-sitter.a` in the Tree-sitter folder:

```sh
script/build-lib
```

Alternatively, you can use the library in a larger project by adding one source file to the project. This source file needs three directories to be in the include path when compiled:

**source file:**
* `tree-sitter/lib/src/lib.c`

**include directories:**
* `tree-sitter/lib/src`
* `tree-sitter/lib/include`
* `tree-sitter/lib/utf8proc`

## The Objects

There are four main types of objects involved when using Tree-sitter: languages, parsers, syntax trees, and syntax nodes. In C, these are called `TSLanguage`, `TSParser`, `TSTree`, and `TSNode`.

* A `TSLanguage` is an opaque object that defines how to parse a particular programming language. The code for each `TSLanguage` is generated by Tree-sitter. Many languages are already available in separate git repositories within the [Tree-sitter GitHub organization](https://github.com/tree-sitter). See [the next section](./creating-parsers) for how to create new languages.
* A `TSParser` is a stateful object that can be assigned a `TSLanguage` and used to produce a `TSTree` based on some source code.
* A `TSTree` represents the syntax tree of an entire source code file. It contains `TSNode` instances that indicate the structure of the source code. It can also be edited and used to produce a new `TSTree` in the event that the source code changes.
* A `TSNode` represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children.

## An Example Program

Here's an example of a simple C program that uses the Tree-sitter [JSON parser](https://github.com/tree-sitter/tree-sitter-json).

```c
// Filename - test-json-parser.c

#include <assert.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <tree_sitter/api.h>

// Declare the `tree_sitter_json` function, which is
// implemented by the `tree-sitter-json` library.
const TSLanguage *tree_sitter_json(void);

int main() {
  // Create a parser.
  TSParser *parser = ts_parser_new();

  // Set the parser's language (JSON in this case).
  ts_parser_set_language(parser, tree_sitter_json());

  // Build a syntax tree based on source code stored in a string.
  const char *source_code = "[1, null]";
  TSTree *tree = ts_parser_parse_string(
    parser,
    NULL,
    source_code,
    strlen(source_code)
  );

  // Get the root node of the syntax tree.
  TSNode root_node = ts_tree_root_node(tree);

  // Get some child nodes.
  TSNode array_node = ts_node_named_child(root_node, 0);
  TSNode number_node = ts_node_named_child(array_node, 0);

  // Check that the nodes have the expected types.
  assert(strcmp(ts_node_type(root_node), "value") == 0);
  assert(strcmp(ts_node_type(array_node), "array") == 0);
  assert(strcmp(ts_node_type(number_node), "number") == 0);

  // Check that the nodes have the expected child counts.
  assert(ts_node_child_count(root_node) == 1);
  assert(ts_node_child_count(array_node) == 5);
  assert(ts_node_named_child_count(array_node) == 2);
  assert(ts_node_child_count(number_node) == 0);

  // Print the syntax tree as an S-expression.
  char *string = ts_node_string(root_node);
  printf("Syntax tree: %s\n", string);

  // Free all of the heap-allocated memory.
  free(string);
  ts_tree_delete(tree);
  ts_parser_delete(parser);
  return 0;
}
```

This program uses the Tree-sitter C API, which is declared in the header file `tree_sitter/api.h`, so we need to add the `tree-sitter/lib/include` directory to the include path. We also need to link `libtree-sitter.a` into the binary. We compile the source code of the JSON language directly into the binary as well.

```sh
clang                                   \
  -I tree-sitter/lib/include            \
  test-json-parser.c                    \
  tree-sitter-json/src/parser.c         \
  tree-sitter/libtree-sitter.a          \
  -o test-json-parser

./test-json-parser
```

## Providing the Source Code

In the example above, we parsed source code stored in a simple string using the `ts_parser_parse_string` function:

```c
TSTree *ts_parser_parse_string(
  TSParser *self,
  const TSTree *old_tree,
  const char *string,
  uint32_t length
);
```

You may want to parse source code that's stored in a custom data structure, like a [piece table](https://en.wikipedia.org/wiki/Piece_table) or a [rope](https://en.wikipedia.org/wiki/Rope_(data_structure)). In this case, you can use the more general `ts_parser_parse` function:

```c
TSTree *ts_parser_parse(
  TSParser *self,
  const TSTree *old_tree,
  TSInput input
);
```

The `TSInput` structure lets you provide your own function for reading a chunk of text at a given byte offset and row/column position. The function can return text encoded in either UTF-8 or UTF-16. This interface allows you to efficiently parse text that is stored in your own data structure.

```c
typedef struct {
  void *payload;
  const char *(*read)(
    void *payload,
    uint32_t byte_offset,
    TSPoint position,
    uint32_t *bytes_read
  );
  TSInputEncoding encoding;
} TSInput;
```
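
As a minimal sketch of such a read function, the code below serves text that is stored as an array of separate chunks. The `Chunk` and `ChunkedText` types and the `chunked_text_read` name are hypothetical and exist only for this illustration. The `TSPoint` typedef is repeated from the header so that the sketch compiles on its own. The sketch also assumes that reporting zero bytes read marks the end of the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Same fields as the `TSPoint` declared in `tree_sitter/api.h`. It is repeated
// here only so the sketch compiles standalone; drop this typedef when
// including the real header.
typedef struct {
  uint32_t row;
  uint32_t column;
} TSPoint;

// Hypothetical text storage: an array of separately stored chunks.
typedef struct {
  const char *bytes;
  uint32_t length;
} Chunk;

typedef struct {
  const Chunk *chunks;
  uint32_t chunk_count;
} ChunkedText;

// A function with the signature of the `read` field of `TSInput`. It returns
// a pointer to the text at `byte_offset` and stores in `*bytes_read` how many
// contiguous bytes are available from that pointer.
const char *chunked_text_read(
  void *payload,
  uint32_t byte_offset,
  TSPoint position,
  uint32_t *bytes_read
) {
  assert(payload != NULL && bytes_read != NULL);
  (void)position;  // This storage is indexed by byte offset only.

  const ChunkedText *text = payload;
  uint32_t chunk_start = 0;
  for (uint32_t i = 0; i < text->chunk_count; i++) {
    const Chunk *chunk = &text->chunks[i];
    if (byte_offset < chunk_start + chunk->length) {
      uint32_t offset_in_chunk = byte_offset - chunk_start;
      *bytes_read = chunk->length - offset_in_chunk;
      return chunk->bytes + offset_in_chunk;
    }
    chunk_start += chunk->length;
  }

  // Past the last chunk: report an empty read (assumed to mean end of input).
  *bytes_read = 0;
  return "";
}
```

With the real header included, a `TSInput` value whose `payload` points at a `ChunkedText` and whose `read` field is `chunked_text_read` could then be passed to `ts_parser_parse`. Returning only the remainder of the current chunk keeps the function simple, since the parser is expected to call it again for the following bytes.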

## Syntax Nodes

Tree-sitter provides a [DOM](https://en.wikipedia.org/wiki/Document_Object_Model)-style interface for inspecting syntax trees. A syntax node's *type* is a string that indicates which grammar rule the node represents.

```c
const char *ts_node_type(TSNode);
```

Syntax nodes store their position in the source code both in terms of raw bytes and row/column coordinates:

```c
uint32_t ts_node_start_byte(TSNode);
uint32_t ts_node_end_byte(TSNode);

typedef struct {
  uint32_t row;
  uint32_t column;
} TSPoint;

TSPoint ts_node_start_point(TSNode);
TSPoint ts_node_end_point(TSNode);
```
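
Byte positions are often used to recover the text that a node covers. The sketch below copies a byte range out of the source string. `copy_byte_range` is a hypothetical helper, not part of the Tree-sitter API, and it assumes that the end byte is exclusive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Copy the bytes in [start_byte, end_byte) out of `source` into a newly
// allocated, NUL-terminated string. Returns NULL if the allocation fails.
// The caller owns the result and must `free` it.
char *copy_byte_range(
  const char *source,
  uint32_t start_byte,
  uint32_t end_byte
) {
  assert(source != NULL && start_byte <= end_byte);
  size_t length = end_byte - start_byte;
  char *text = malloc(length + 1);
  if (text == NULL) return NULL;
  memcpy(text, source + start_byte, length);
  text[length] = '\0';
  return text;
}
```

For a node obtained from a tree, the call would look like `copy_byte_range(source_code, ts_node_start_byte(node), ts_node_end_byte(node))`, where `source_code` is the same buffer that was parsed.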

## Retrieving Nodes

Every tree has a *root node*:

```c
TSNode ts_tree_root_node(const TSTree *);
```

Once you have a node, you can access the node's children:

```c
uint32_t ts_node_child_count(TSNode);
TSNode ts_node_child(TSNode, uint32_t);
```

You can also access its siblings and parent:

```c
TSNode ts_node_next_sibling(TSNode);
TSNode ts_node_prev_sibling(TSNode);
TSNode ts_node_parent(TSNode);
```

These methods may all return a *null node* to indicate, for example, that a node does not *have* a next sibling. You can check if a node is null:

```c
bool ts_node_is_null(TSNode);
```

## Named vs Anonymous Nodes

Tree-sitter produces [*concrete* syntax trees](https://en.wikipedia.org/wiki/Parse_tree) - trees that contain nodes for every individual token in the source code, including things like commas and parentheses. This is important for use cases that deal with individual tokens, like [syntax highlighting](https://en.wikipedia.org/wiki/Syntax_highlighting). But some types of code analysis are easier to perform using an [*abstract* syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) - a tree in which the less important details have been removed. Tree-sitter's trees support these use cases by making a distinction between *named* and *anonymous* nodes.

Consider a grammar rule like this:

```js
if_statement: $ => seq(
  'if',
  '(',
  $._expression,
  ')',
  $._statement,
)
```

A syntax node representing an `if_statement` in this language would have 5 children: the condition expression, the body statement, as well as the `if`, `(`, and `)` tokens. The expression and the statement would be marked as *named* nodes, because they have been given explicit names in the grammar. But the `if`, `(`, and `)` nodes would *not* be named nodes, because they are represented in the grammar as simple strings.

You can check whether any given node is named:

```c
bool ts_node_is_named(TSNode);
```

When traversing the tree, you can also choose to skip over anonymous nodes by using the `_named_` variants of all of the methods described above:

```c
TSNode ts_node_named_child(TSNode, uint32_t);
uint32_t ts_node_named_child_count(TSNode);
TSNode ts_node_next_named_sibling(TSNode);
TSNode ts_node_prev_named_sibling(TSNode);
```

If you use this group of methods, the syntax tree functions much like an abstract syntax tree.

## Editing

In applications like text editors, you often need to re-parse a file after its source code has changed. Tree-sitter is designed to support this use case efficiently. There are two steps required. First, you must *edit* the syntax tree, which adjusts the ranges of its nodes so that they stay in sync with the code.

```c
typedef struct {
  uint32_t start_byte;
  uint32_t old_end_byte;
  uint32_t new_end_byte;
  TSPoint start_point;
  TSPoint old_end_point;
  TSPoint new_end_point;
} TSInputEdit;

void ts_tree_edit(TSTree *, const TSInputEdit *);
```

Then, you can call `ts_parser_parse` again, passing in the old tree. This will create a new tree that internally shares structure with the old tree.

When you edit a syntax tree, the positions of its nodes will change. If you have stored any `TSNode` instances outside of the `TSTree`, you must update their cached positions separately, using the same `TSInputEdit` value.

```c
void ts_node_edit(TSNode *, const TSInputEdit *);
```

This `ts_node_edit` function is *only* needed in the case where you have retrieved `TSNode` instances *before* editing the tree, and then *after* editing the tree, you want to continue to use those specific node instances. Often, you'll just want to re-fetch nodes from the edited tree, in which case `ts_node_edit` is not needed.
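
The fields of a `TSInputEdit` can be derived from the old and new text once the changed byte range is known. The following is a minimal sketch of that bookkeeping. `point_at_byte` and `make_input_edit` are hypothetical helpers, the two typedefs are repeated from the declarations above so that the code compiles standalone, and columns are assumed to be counted in bytes from the start of the line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Repeated from the declarations in this document so the sketch compiles on
// its own; drop both typedefs when including `tree_sitter/api.h`.
typedef struct {
  uint32_t row;
  uint32_t column;
} TSPoint;

typedef struct {
  uint32_t start_byte;
  uint32_t old_end_byte;
  uint32_t new_end_byte;
  TSPoint start_point;
  TSPoint old_end_point;
  TSPoint new_end_point;
} TSInputEdit;

// Row/column of a byte offset: rows are counted by '\n', columns in bytes
// since the start of the line (an assumption of this sketch).
TSPoint point_at_byte(const char *text, uint32_t byte) {
  TSPoint point = {0, 0};
  for (uint32_t i = 0; i < byte; i++) {
    if (text[i] == '\n') {
      point.row++;
      point.column = 0;
    } else {
      point.column++;
    }
  }
  return point;
}

// Describe the replacement of old_text[start, old_end) by text that ends at
// byte `new_end` in `new_text`. Both texts are identical before `start`.
TSInputEdit make_input_edit(
  const char *old_text,
  const char *new_text,
  uint32_t start,
  uint32_t old_end,
  uint32_t new_end
) {
  assert(old_text != NULL && new_text != NULL);
  assert(start <= old_end && start <= new_end);
  TSInputEdit edit;
  edit.start_byte = start;
  edit.old_end_byte = old_end;
  edit.new_end_byte = new_end;
  edit.start_point = point_at_byte(old_text, start);
  edit.old_end_point = point_at_byte(old_text, old_end);
  edit.new_end_point = point_at_byte(new_text, new_end);
  return edit;
}
```

For example, changing `[1, null]` into `[1, 23, null]` inserts four bytes at offset 4, so the call would be `make_input_edit(old_text, new_text, 4, 4, 8)`. The result would be passed to `ts_tree_edit` before reparsing with the old tree. Scanning from the start of the text on every edit is linear in the offset. An editor that already tracks line starts could compute the points more cheaply.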

## Multi-language Documents

Sometimes, different parts of a file may be written in different languages. For example, templating languages like [EJS](http://ejs.co) and [ERB](https://ruby-doc.org/stdlib-2.5.1/libdoc/erb/rdoc/ERB.html) allow you to generate HTML by writing a mixture of HTML and another language like JavaScript or Ruby.

Tree-sitter handles these types of documents by allowing you to create a syntax tree based on the text in certain *ranges* of a file.

```c
typedef struct {
  TSPoint start_point;
  TSPoint end_point;
  uint32_t start_byte;
  uint32_t end_byte;
} TSRange;

void ts_parser_set_included_ranges(
  TSParser *self,
  const TSRange *ranges,
  uint32_t range_count
);
```

For example, consider this ERB document:

```erb
<ul>
  <% people.each do |person| %>
    <li><%= person.name %></li>
  <% end %>
</ul>
```

Conceptually, it can be represented by three syntax trees with overlapping ranges: an ERB syntax tree, a Ruby syntax tree, and an HTML syntax tree. You could generate these syntax trees with the following code:

```c
#include <stdio.h>
#include <string.h>
#include <tree_sitter/api.h>

// These functions are each implemented in their own repo.
const TSLanguage *tree_sitter_embedded_template();
const TSLanguage *tree_sitter_html();
const TSLanguage *tree_sitter_ruby();

int main(int argc, const char **argv) {
  const char *text = argv[1];
  unsigned len = strlen(text);

  // Parse the entire text as ERB.
  TSParser *parser = ts_parser_new();
  ts_parser_set_language(parser, tree_sitter_embedded_template());
  TSTree *erb_tree = ts_parser_parse_string(parser, NULL, text, len);
  TSNode erb_root_node = ts_tree_root_node(erb_tree);

  // In the ERB syntax tree, find the ranges of the `content` nodes,
  // which represent the underlying HTML, and the `code` nodes, which
  // represent the interpolated Ruby.
  TSRange html_ranges[10];
  TSRange ruby_ranges[10];
  unsigned html_range_count = 0;
  unsigned ruby_range_count = 0;
  unsigned child_count = ts_node_child_count(erb_root_node);

  for (unsigned i = 0; i < child_count; i++) {
    TSNode node = ts_node_child(erb_root_node, i);
    if (strcmp(ts_node_type(node), "content") == 0) {
      html_ranges[html_range_count++] = (TSRange) {
        ts_node_start_point(node),
        ts_node_end_point(node),
        ts_node_start_byte(node),
        ts_node_end_byte(node),
      };
    } else {
      TSNode code_node = ts_node_named_child(node, 0);
      ruby_ranges[ruby_range_count++] = (TSRange) {
        ts_node_start_point(code_node),
        ts_node_end_point(code_node),
        ts_node_start_byte(code_node),
        ts_node_end_byte(code_node),
      };
    }
  }

  // Use the HTML ranges to parse the HTML.
  ts_parser_set_language(parser, tree_sitter_html());
  ts_parser_set_included_ranges(parser, html_ranges, html_range_count);
  TSTree *html_tree = ts_parser_parse_string(parser, NULL, text, len);
  TSNode html_root_node = ts_tree_root_node(html_tree);

  // Use the Ruby ranges to parse the Ruby.
  ts_parser_set_language(parser, tree_sitter_ruby());
  ts_parser_set_included_ranges(parser, ruby_ranges, ruby_range_count);
  TSTree *ruby_tree = ts_parser_parse_string(parser, NULL, text, len);
  TSNode ruby_root_node = ts_tree_root_node(ruby_tree);

  // Print all three trees.
  char *erb_sexp = ts_node_string(erb_root_node);
  char *html_sexp = ts_node_string(html_root_node);
  char *ruby_sexp = ts_node_string(ruby_root_node);
  printf("ERB: %s\n", erb_sexp);
  printf("HTML: %s\n", html_sexp);
  printf("Ruby: %s\n", ruby_sexp);
  return 0;
}
```

This API allows for great flexibility in how languages can be composed. Tree-sitter is not responsible for mediating the interactions between languages. Instead, you are free to do that using arbitrary application-specific logic.
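
Because the ranges are assembled by application code, it can be useful to sanity-check them before handing them to the parser. The sketch below assumes that included ranges are expected to be ordered from earliest to latest and must not overlap. This requirement is not stated in this document, so treat it as an assumption to verify against the header. `ranges_are_ordered` is a hypothetical helper, and the typedefs are repeated from above so that the code compiles standalone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Repeated from the declarations in this document so the sketch compiles on
// its own; drop both typedefs when including `tree_sitter/api.h`.
typedef struct {
  uint32_t row;
  uint32_t column;
} TSPoint;

typedef struct {
  TSPoint start_point;
  TSPoint end_point;
  uint32_t start_byte;
  uint32_t end_byte;
} TSRange;

// Returns true if every range is well-formed (start <= end) and each range
// begins at or after the end of the previous one. Only byte offsets are
// compared; the points are assumed to be consistent with them.
bool ranges_are_ordered(const TSRange *ranges, uint32_t range_count) {
  assert(ranges != NULL || range_count == 0);
  for (uint32_t i = 0; i < range_count; i++) {
    if (ranges[i].start_byte > ranges[i].end_byte) return false;
    if (i > 0 && ranges[i - 1].end_byte > ranges[i].start_byte) return false;
  }
  return true;
}
```

In the ERB example above, such a check could run on `html_ranges` and `ruby_ranges` just before each call to `ts_parser_set_included_ranges`.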

## Concurrency

Tree-sitter supports multi-threaded use cases by making syntax trees very cheap to copy.

```c
TSTree *ts_tree_copy(const TSTree *);
```

Internally, copying a syntax tree just entails incrementing an atomic reference count. Conceptually, it provides you a new tree which you can freely query, edit, reparse, or delete on a new thread while continuing to use the original tree on a different thread. Note that individual `TSTree` instances are *not* thread safe; you must copy a tree if you want to use it on multiple threads simultaneously.