diff --git a/README.md b/README.md index 34390187..e74c6e45 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,11 @@ # tree-sitter -[![CICD](https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml/badge.svg)](https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml) +[![CICD badge]][CICD] [![DOI](https://zenodo.org/badge/14164618.svg)](https://zenodo.org/badge/latestdoi/14164618) +[CICD badge]: https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml/badge.svg +[CICD]: https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml + Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited. Tree-sitter aims to be: - **General** enough to parse any programming language diff --git a/cli/README.md b/cli/README.md index eff3608c..eb93bcfa 100644 --- a/cli/README.md +++ b/cli/README.md @@ -1,7 +1,11 @@ -Tree-sitter CLI -=============== +# Tree-sitter CLI -[![Crates.io](https://img.shields.io/crates/v/tree-sitter-cli.svg)](https://crates.io/crates/tree-sitter-cli) +[![crates.io badge]][crates.io] [![npmjs.com badge]][npmjs.com] + +[crates.io]: https://crates.io/crates/tree-sitter-cli +[crates.io badge]: https://img.shields.io/crates/v/tree-sitter-cli.svg?color=%23B48723 +[npmjs.com]: https://www.npmjs.org/package/tree-sitter-cli +[npmjs.com badge]: https://img.shields.io/npm/v/tree-sitter-cli.svg?color=%23BF4A4A The Tree-sitter CLI allows you to develop, test, and use Tree-sitter grammars from the command line. It works on MacOS, Linux, and Windows. @@ -19,7 +23,7 @@ or with `npm`: npm install tree-sitter-cli ``` -You can also download a pre-built binary for your platform from [the releases page](https://github.com/tree-sitter/tree-sitter/releases/latest). +You can also download a pre-built binary for your platform from [the releases page]. 
### Dependencies @@ -30,8 +34,11 @@ The `tree-sitter` binary itself has no dependencies, but specific commands have ### Commands -* `generate` - The `tree-sitter generate` command will generate a Tree-sitter parser based on the grammar in the current working directory. See [the documentation](https://tree-sitter.github.io/tree-sitter/creating-parsers) for more information. +* `generate` - The `tree-sitter generate` command will generate a Tree-sitter parser based on the grammar in the current working directory. See [the documentation] for more information. -* `test` - The `tree-sitter test` command will run the unit tests for the Tree-sitter parser in the current working directory. See [the documentation](https://tree-sitter.github.io/tree-sitter/creating-parsers) for more information. +* `test` - The `tree-sitter test` command will run the unit tests for the Tree-sitter parser in the current working directory. See [the documentation] for more information. * `parse` - The `tree-sitter parse` command will parse a file (or list of files) using Tree-sitter parsers. + +[the documentation]: https://tree-sitter.github.io/tree-sitter/creating-parsers +[the releases page]: https://github.com/tree-sitter/tree-sitter/releases/latest diff --git a/docs/index.md b/docs/index.md index 9fb1fd2a..ddfff214 100644 --- a/docs/index.md +++ b/docs/index.md @@ -160,9 +160,9 @@ By convention, parsers are named with the language last, eg. tree-sitter-ruby. 
The design of Tree-sitter was greatly influenced by the following research papers: -- [Practical Algorithms for Incremental Software Development Environments](https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf) -- [Context Aware Scanning for Parsing Extensible Languages](https://www-users.cse.umn.edu/~evw/pubs/vanwyk07gpce/vanwyk07gpce.pdf) -- [Efficient and Flexible Incremental Parsing](https://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf) -- [Incremental Analysis of Real Programming Languages](https://harmonia.cs.berkeley.edu/papers/twagner-glr.pdf) -- [Error Detection and Recovery in LR Parsers](https://what-when-how.com/compiler-writing/bottom-up-parsing-compiler-writing-part-13) -- [Error Recovery for LR Parsers](https://apps.dtic.mil/sti/pdfs/ADA043470.pdf) +* [Practical Algorithms for Incremental Software Development Environments](https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf) +* [Context Aware Scanning for Parsing Extensible Languages](https://www-users.cse.umn.edu/~evw/pubs/vanwyk07gpce/vanwyk07gpce.pdf) +* [Efficient and Flexible Incremental Parsing](https://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf) +* [Incremental Analysis of Real Programming Languages](https://harmonia.cs.berkeley.edu/papers/twagner-glr.pdf) +* [Error Detection and Recovery in LR Parsers](https://what-when-how.com/compiler-writing/bottom-up-parsing-compiler-writing-part-13) +* [Error Recovery for LR Parsers](https://apps.dtic.mil/sti/pdfs/ADA043470.pdf) diff --git a/docs/section-2-using-parsers.md b/docs/section-2-using-parsers.md index 0d683dc1..87c049e7 100644 --- a/docs/section-2-using-parsers.md +++ b/docs/section-2-using-parsers.md @@ -21,21 +21,21 @@ Alternatively, you can incorporate the library in a larger project's build syste **source file:** -- `tree-sitter/lib/src/lib.c` +* `tree-sitter/lib/src/lib.c` **include directories:** -- `tree-sitter/lib/src` -- `tree-sitter/lib/include` +* `tree-sitter/lib/src` +* 
`tree-sitter/lib/include` ### The Basic Objects There are four main types of objects involved when using Tree-sitter: languages, parsers, syntax trees, and syntax nodes. In C, these are called `TSLanguage`, `TSParser`, `TSTree`, and `TSNode`. -- A `TSLanguage` is an opaque object that defines how to parse a particular programming language. The code for each `TSLanguage` is generated by Tree-sitter. Many languages are already available in separate git repositories within the [Tree-sitter GitHub organization](https://github.com/tree-sitter). See [the next page](./creating-parsers) for how to create new languages. -- A `TSParser` is a stateful object that can be assigned a `TSLanguage` and used to produce a `TSTree` based on some source code. -- A `TSTree` represents the syntax tree of an entire source code file. It contains `TSNode` instances that indicate the structure of the source code. It can also be edited and used to produce a new `TSTree` in the event that the source code changes. -- A `TSNode` represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children. +* A `TSLanguage` is an opaque object that defines how to parse a particular programming language. The code for each `TSLanguage` is generated by Tree-sitter. Many languages are already available in separate git repositories within the [Tree-sitter GitHub organization](https://github.com/tree-sitter). See [the next page](./creating-parsers) for how to create new languages. +* A `TSParser` is a stateful object that can be assigned a `TSLanguage` and used to produce a `TSTree` based on some source code. +* A `TSTree` represents the syntax tree of an entire source code file. It contains `TSNode` instances that indicate the structure of the source code. It can also be edited and used to produce a new `TSTree` in the event that the source code changes. 
+* A `TSNode` represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children. ### An Example Program @@ -442,13 +442,13 @@ Many code analysis tasks involve searching for patterns in syntax trees. Tree-si A _query_ consists of one or more _patterns_, where each pattern is an [S-expression](https://en.wikipedia.org/wiki/S-expression) that matches a certain set of nodes in a syntax tree. The expression to match a given node consists of a pair of parentheses containing two things: the node's type, and optionally, a series of other S-expressions that match the node's children. For example, this pattern would match any `binary_expression` node whose children are both `number_literal` nodes: -``` scheme +```scheme (binary_expression (number_literal) (number_literal)) ``` Children can also be omitted. For example, this would match any `binary_expression` where at least _one_ of child is a `string_literal` node: -``` scheme +```scheme (binary_expression (string_literal)) ``` @@ -456,7 +456,7 @@ Children can also be omitted. For example, this would match any `binary_expressi In general, it's a good idea to make patterns more specific by specifying [field names](#node-field-names) associated with child nodes. You do this by prefixing a child pattern with a field name followed by a colon. For example, this pattern would match an `assignment_expression` node where the `left` child is a `member_expression` whose `object` is a `call_expression`. -``` scheme +```scheme (assignment_expression left: (member_expression object: (call_expression))) @@ -464,9 +464,9 @@ In general, it's a good idea to make patterns more specific by specifying [field #### Negated Fields -You can also constrain a pattern so that it only matches nodes that *lack* a certain field. To do this, add a field name prefixed by a `!` within the parent pattern. 
For example, this pattern would match a class declaration with no type parameters: +You can also constrain a pattern so that it only matches nodes that _lack_ a certain field. To do this, add a field name prefixed by a `!` within the parent pattern. For example, this pattern would match a class declaration with no type parameters: -``` scheme +```scheme (class_declaration name: (identifier) @class_name !type_parameters) @@ -476,7 +476,7 @@ You can also constrain a pattern so that it only matches nodes that *lack* a cer The parenthesized syntax for writing nodes only applies to [named nodes](#named-vs-anonymous-nodes). To match specific anonymous nodes, you write their name between double quotes. For example, this pattern would match any `binary_expression` where the operator is `!=` and the right side is `null`: -``` scheme +```scheme (binary_expression operator: "!=" right: (null)) @@ -488,7 +488,7 @@ When matching patterns, you may want to process specific nodes within the patter For example, this pattern would match any assignment of a `function` to an `identifier`, and it would associate the name `the-function-name` with the identifier: -``` scheme +```scheme (assignment_expression left: (identifier) @the-function-name right: (function)) @@ -496,7 +496,7 @@ For example, this pattern would match any assignment of a `function` to an `iden And this pattern would match all method definitions, associating the name `the-method-name` with the method name, `the-class-name` with the containing class name: -``` scheme +```scheme (class_declaration name: (identifier) @the-class-name body: (class_body @@ -510,13 +510,13 @@ You can match a repeating sequence of sibling nodes using the postfix `+` and `* For example, this pattern would match a sequence of one or more comments: -``` scheme +```scheme (comment)+ ``` This pattern would match a class declaration, capturing all of the decorators if any were present: -``` scheme +```scheme (class_declaration (decorator)* 
@the-decorator name: (identifier) @the-name) @@ -524,7 +524,7 @@ This pattern would match a class declaration, capturing all of the decorators if You can also mark a node as optional using the `?` operator. For example, this pattern would match all function calls, capturing a string argument if one was present: -``` scheme +```scheme (call_expression function: (identifier) @the-function arguments: (arguments (string)? @the-string-arg)) @@ -534,7 +534,7 @@ You can also mark a node as optional using the `?` operator. For example, this p You can also use parentheses for grouping a sequence of _sibling_ nodes. For example, this pattern would match a comment followed by a function declaration: -``` scheme +```scheme ( (comment) (function_declaration) @@ -543,7 +543,7 @@ You can also use parentheses for grouping a sequence of _sibling_ nodes. For exa Any of the quantification operators mentioned above (`+`, `*`, and `?`) can also be applied to groups. For example, this pattern would match a comma-separated series of numbers: -``` scheme +```scheme ( (number) ("," (number))* @@ -558,7 +558,7 @@ This is similar to _character classes_ from regular expressions (`[abc]` matches For example, this pattern would match a call to either a variable or an object property. In the case of a variable, capture it as `@function`, and in the case of a property, capture it as `@method`: -``` scheme +```scheme (call_expression function: [ (identifier) @function @@ -569,7 +569,7 @@ In the case of a variable, capture it as `@function`, and in the case of a prope This pattern would match a set of possible keyword tokens, capturing them as `@keyword`: -``` scheme +```scheme [ "break" "delete" @@ -592,7 +592,7 @@ and `_` will match any named or anonymous node. 
For example, this pattern would match any node inside a call: -``` scheme +```scheme (call (_) @call.inner) ``` @@ -602,7 +602,7 @@ The anchor operator, `.`, is used to constrain the ways in which child patterns When `.` is placed before the _first_ child within a parent pattern, the child will only match when it is the first named node in the parent. For example, the below pattern matches a given `array` node at most once, assigning the `@the-element` capture to the first `identifier` node in the parent `array`: -``` scheme +```scheme (array . (identifier) @the-element) ``` @@ -610,13 +610,13 @@ Without this anchor, the pattern would match once for every identifier in the ar Similarly, an anchor placed after a pattern's _last_ child will cause that child pattern to only match nodes that are the last named child of their parent. The below pattern matches only nodes that are the last named child within a `block`. -``` scheme +```scheme (block (_) @last-expression .) ``` Finally, an anchor _between_ two child patterns will cause the patterns to only match nodes that are immediate siblings. The pattern below, given a long dotted name like `a.b.c.d`, will only match pairs of consecutive identifiers: `a, b`, `b, c`, and `c, d`. -``` scheme +```scheme (dotted_name (identifier) @prev-id . @@ -633,7 +633,7 @@ You can also specify arbitrary metadata and conditions associated with a pattern For example, this pattern would match identifier whose names is written in `SCREAMING_SNAKE_CASE`: -``` scheme +```scheme ( (identifier) @constant (#match? 
@constant "^[A-Z][A-Z_]+") @@ -642,7 +642,7 @@ For example, this pattern would match identifier whose names is written in `SCRE And this pattern would match key-value pairs where the `value` is an identifier with the same name as the key: -``` scheme +```scheme ( (pair key: (property_identifier) @key-name @@ -723,8 +723,8 @@ The node types file contains an array of objects, each of which describes a part Every object in this array has these two entries: -- `"type"` - A string that indicates which grammar rule the node represents. This corresponds to the `ts_node_type` function described [above](#syntax-nodes). -- `"named"` - A boolean that indicates whether this kind of node corresponds to a rule name in the grammar or just a string literal. See [above](#named-vs-anonymous-nodes) for more info. +* `"type"` - A string that indicates which grammar rule the node represents. This corresponds to the `ts_node_type` function described [above](#syntax-nodes). +* `"named"` - A boolean that indicates whether this kind of node corresponds to a rule name in the grammar or just a string literal. See [above](#named-vs-anonymous-nodes) for more info. Examples: @@ -745,14 +745,14 @@ Together, these two fields constitute a unique identifier for a node type; no tw Many syntax nodes can have _children_. The node type object describes the possible children that a node can have using the following entries: -- `"fields"` - An object that describes the possible [fields](#node-field-names) that the node can have. The keys of this object are field names, and the values are _child type_ objects, described below. -- `"children"` - Another _child type_ object that describes all of the node's possible _named_ children _without_ fields. +* `"fields"` - An object that describes the possible [fields](#node-field-names) that the node can have. The keys of this object are field names, and the values are _child type_ objects, described below. 
+* `"children"` - Another _child type_ object that describes all of the node's possible _named_ children _without_ fields. A _child type_ object describes a set of child nodes using the following entries: -- `"required"` - A boolean indicating whether there is always _at least one_ node in this set. -- `"multiple"` - A boolean indicating whether there can be _multiple_ nodes in this set. -- `"types"`- An array of objects that represent the possible types of nodes in this set. Each object has two keys: `"type"` and `"named"`, whose meanings are described above. +* `"required"` - A boolean indicating whether there is always _at least one_ node in this set. +* `"multiple"` - A boolean indicating whether there can be _multiple_ nodes in this set. +* `"types"`- An array of objects that represent the possible types of nodes in this set. Each object has two keys: `"type"` and `"named"`, whose meanings are described above. Example with fields: @@ -812,7 +812,7 @@ In Tree-sitter grammars, there are usually certain rules that represent abstract Normally, hidden rules are not mentioned in the node types file, since they don't appear in the syntax tree. But if you add a hidden rule to the grammar's [`supertypes` list](./creating-parsers#the-grammar-dsl), then it _will_ show up in the node types file, with the following special entry: -- `"subtypes"` - An array of objects that specify the _types_ of nodes that this 'supertype' node can wrap. +* `"subtypes"` - An array of objects that specify the _types_ of nodes that this 'supertype' node can wrap. 
Example: diff --git a/docs/section-3-creating-parsers.md b/docs/section-3-creating-parsers.md index eb664aec..0aa3b139 100644 --- a/docs/section-3-creating-parsers.md +++ b/docs/section-3-creating-parsers.md @@ -80,7 +80,9 @@ You can test this parser by creating a source file with the contents "hello" and echo 'hello' > example-file tree-sitter parse example-file ``` + Alternatively, in Windows PowerShell: + ```pwsh "hello" | Out-File example-file -Encoding utf8 tree-sitter parse example-file @@ -88,7 +90,7 @@ tree-sitter parse example-file This should print the following: -``` +```text (source_file [0, 0] - [1, 0]) ``` @@ -121,7 +123,7 @@ For each rule that you add to the grammar, you should first create a *test* that For example, you might have a file called `test/corpus/statements.txt` that contains a series of entries like this: -``` +```text ================== Return statements ================== @@ -147,7 +149,7 @@ func x() int { The expected output section can also *optionally* show the [*field names*][field-names-section] associated with each child node. To include field names in your tests, you write a node's field name followed by a colon, before the node itself in the S-expression: -``` +```text (source_file (function_definition name: (identifier) @@ -159,7 +161,7 @@ func x() int { * If your language's syntax conflicts with the `===` and `---` test separators, you can optionally add an arbitrary identical suffix (in the below example, `|||`) to disambiguate them: -``` +```text ==================||| Basic module ==================||| @@ -199,7 +201,7 @@ The `tree-sitter test` command will *also* run any syntax highlighting tests in You can run your parser on an arbitrary file using `tree-sitter parse`. 
This will print the resulting the syntax tree, including nodes' ranges and field names, like this: -``` +```text (source_file [0, 0] - [3, 0] (function_declaration [0, 0] - [2, 1] name: (identifier [0, 5] - [0, 9]) @@ -251,7 +253,6 @@ In addition to the `name` and `rules` fields, grammars have a few other optional * **`word`** - the name of a token that will match keywords for the purpose of the [keyword extraction](#keyword-extraction) optimization. * **`supertypes`** an array of hidden rule names which should be considered to be 'supertypes' in the generated [*node types* file][static-node-types]. - ## Writing the Grammar Writing a grammar requires creativity. There are an infinite number of CFGs (context-free grammars) that can be used to describe any given language. In order to produce a good Tree-sitter parser, you need to create a grammar with two important properties: @@ -375,7 +376,7 @@ return x + y; According to the specification, this line is a `ReturnStatement`, the fragment `x + y` is an `AdditiveExpression`, and `x` and `y` are both `IdentifierReferences`. The relationship between these constructs is captured by a complex series of production rules: -``` +```text ReturnStatement -> 'return' Expression Expression -> AssignmentExpression AssignmentExpression -> ConditionalExpression @@ -432,7 +433,7 @@ To produce a readable syntax tree, we'd like to model JavaScript expressions usi Of course, this flat structure is highly ambiguous. 
If we try to generate a parser, Tree-sitter gives us an error message: -``` +```text Error: Unresolved conflict for symbol sequence: '-' _expression • '*' … @@ -468,7 +469,7 @@ For an expression like `-a * b`, it's not clear whether the `-` operator applies Applying a higher precedence in `unary_expression` fixes that conflict, but there is still another conflict: -``` +```text Error: Unresolved conflict for symbol sequence: _expression '*' _expression • '*' … @@ -606,6 +607,7 @@ Aside from improving error detection, keyword extraction also has performance be ### External Scanners Many languages have some tokens whose structure is impossible or inconvenient to describe with a regular expression. Some examples: + * [Indent and dedent][indent-tokens] tokens in Python * [Heredocs][heredoc] in Bash and Ruby * [Percent strings][percent-string] in Ruby @@ -654,7 +656,6 @@ void * tree_sitter_my_language_external_scanner_create() { This function should create your scanner object. It will only be called once anytime your language is set on a parser. Often, you will want to allocate memory on the heap and return a pointer to it. If your external scanner doesn't need to maintain any state, it's ok to return `NULL`. - #### Destroy ```c @@ -714,10 +715,10 @@ This function is responsible for recognizing external tokens. It should return ` * **`void (*advance)(TSLexer *, bool skip)`** - A function for advancing to the next character. If you pass `true` for the second argument, the current character will be treated as whitespace; whitespace won't be included in the text range associated with tokens emitted by the external scanner. * **`void (*mark_end)(TSLexer *)`** - A function for marking the end of the recognized token. This allows matching tokens that require multiple characters of lookahead. By default (if you don't call `mark_end`), any character that you moved past using the `advance` function will be included in the size of the token. 
But once you call `mark_end`, then any later calls to `advance` will *not* increase the size of the returned token. You can call `mark_end` multiple times to increase the size of the token. * **`uint32_t (*get_column)(TSLexer *)`** - A function for querying the current column position of the lexer. It returns the number of codepoints since the start of the current line. The codepoint position is recalculated on every call to this function by reading from the start of the line. -* **`bool (*is_at_included_range_start)(const TSLexer *)`** - A function for checking whether the parser has just skipped some characters in the document. When parsing an embedded document using the `ts_parser_set_included_ranges` function (described in the [multi-language document section][multi-language-section]), your scanner may want to apply some special behavior when moving to a disjoint part of the document. For example, in [EJS documents][ejs], the JavaScript parser uses this function to enable inserting automatic semicolon tokens in between the code directives, delimited by `<%` and `%>`. +* **`bool (*is_at_included_range_start)(const TSLexer *)`** - A function for checking whether the parser has just skipped some characters in the document. When parsing an embedded document using the `ts_parser_set_included_ranges` function (described in the [multi-language document section][multi-language-section]), the scanner may want to apply some special behavior when moving to a disjoint part of the document. For example, in [EJS documents][ejs], the JavaScript parser uses this function to enable inserting automatic semicolon tokens in between the code directives, delimited by `<%` and `%>`. * **`bool (*eof)(const TSLexer *)`** - A function for determining whether the lexer is at the end of the file. 
The value of `lookahead` will be `0` at the end of a file, but this function should be used instead of checking for that value because the `0` or "NUL" value is also a valid character that could be present in the file being parsed. -The third argument to the `scan` function is an array of booleans that indicates which of your external tokens are currently expected by the parser. You should only look for a given token if it is valid according to this array. At the same time, you cannot backtrack, so you may need to combine certain pieces of logic. +The third argument to the `scan` function is an array of booleans that indicates which external tokens are currently expected by the parser. You should only look for a given token if it is valid according to this array. At the same time, you cannot backtrack, so you may need to combine certain pieces of logic. ```c if (valid_symbols[INDENT] || valid_symbol[DEDENT]) { diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md index 0cf7890f..cedd89a6 100644 --- a/docs/section-4-syntax-highlighting.md +++ b/docs/section-4-syntax-highlighting.md @@ -25,9 +25,9 @@ The Tree-sitter CLI automatically creates two directories in your home folder. These directories are created in the "normal" place for your platform: -- On Linux, `~/.config/tree-sitter` and `~/.cache/tree-sitter` -- On Mac, `~/Library/Application Support/tree-sitter` and `~/Library/Caches/tree-sitter` -- On Windows, `C:\Users\[username]\AppData\Roaming\tree-sitter` and `C:\Users\[username]\AppData\Local\tree-sitter` +* On Linux, `~/.config/tree-sitter` and `~/.cache/tree-sitter` +* On Mac, `~/Library/Application Support/tree-sitter` and `~/Library/Caches/tree-sitter` +* On Windows, `C:\Users\[username]\AppData\Roaming\tree-sitter` and `C:\Users\[username]\AppData\Local\tree-sitter` The CLI will work if there's no config file present, falling back on default values for each configuration option. 
To create a config file that you can edit, run this command: @@ -61,6 +61,7 @@ In your config file, the `"theme"` value is an object whose keys are dot-separat #### Highlight Names A theme can contain multiple keys that share a common subsequence. Examples: + * `variable` and `variable.parameter` * `function`, `function.builtin`, and `function.method` @@ -158,7 +159,7 @@ func increment(a int) int { With this syntax tree: -``` +```scheme (source_file (function_declaration name: (identifier) @@ -178,6 +179,7 @@ With this syntax tree: #### Example Query Suppose we wanted to render this code with the following colors: + * keywords `func` and `return` in purple * function `increment` in blue * type `int` in green @@ -185,7 +187,7 @@ Suppose we wanted to render this code with the following colors: We can assign each of these categories a *highlight name* using a query like this: -``` +```scheme ; highlights.scm "func" @keyword @@ -252,7 +254,7 @@ list = [item] With this syntax tree: -``` +```scheme (program (method name: (identifier) @@ -295,7 +297,7 @@ There are several different types of names within this method: Let's write some queries that let us clearly distinguish between these types of names. First, set up the highlighting query, as described in the previous section. We'll assign distinct colors to method calls, method definitions, and formal parameters: -``` +```scheme ; highlights.scm (call method: (identifier) @function.method) @@ -312,7 +314,7 @@ Let's write some queries that let us clearly distinguish between these types of Then, we'll set up a local variable query to keep track of the variables and scopes. 
Here, we're indicating that methods and blocks create local *scopes*, parameters and assignments create *definitions*, and other identifiers should be considered *references*: -``` +```scheme ; locals.scm (method) @local.scope @@ -345,6 +347,7 @@ Running `tree-sitter highlight` on this ruby file would produce output like this ### Language Injection Some source files contain code written in multiple different languages. Examples include: + * HTML files, which can contain JavaScript inside of `