From 579b8e8d2822a38d6b7e8a9fb30589ad48040e33 Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Wed, 12 Feb 2020 09:29:49 -0800
Subject: [PATCH 01/10] Rename files to make room for syntax highlighting section

---
 docs/section-4-syntax-highlighting.md | 6 ++++++
 ...tion-4-implementation.md => section-5-implementation.md} | 0
 ...{section-5-contributing.md => section-6-contributing.md} | 0
 ...{section-6-playground.html => section-7-playground.html} | 0
 4 files changed, 6 insertions(+)
 create mode 100644 docs/section-4-syntax-highlighting.md
 rename docs/{section-4-implementation.md => section-5-implementation.md} (100%)
 rename docs/{section-5-contributing.md => section-6-contributing.md} (100%)
 rename docs/{section-6-playground.html => section-7-playground.html} (100%)

diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md
new file mode 100644
index 00000000..0155a9a2
--- /dev/null
+++ b/docs/section-4-syntax-highlighting.md
@@ -0,0 +1,6 @@
+---
+title: Syntax Highlighting
+permalink: syntax-highlighting
+---
+
+# Syntax Highlighting
diff --git a/docs/section-4-implementation.md b/docs/section-5-implementation.md
similarity index 100%
rename from docs/section-4-implementation.md
rename to docs/section-5-implementation.md
diff --git a/docs/section-5-contributing.md b/docs/section-6-contributing.md
similarity index 100%
rename from docs/section-5-contributing.md
rename to docs/section-6-contributing.md
diff --git a/docs/section-6-playground.html b/docs/section-7-playground.html
similarity index 100%
rename from docs/section-6-playground.html
rename to docs/section-7-playground.html

From 2a8860542c344d72eee65b3d6a0dafc1fb9c5e10 Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Wed, 12 Feb 2020 09:42:56 -0800
Subject: [PATCH 02/10] Add intro to syntax highlighting docs

---
 docs/section-4-syntax-highlighting.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md
index 0155a9a2..44e6e6de 100644
--- a/docs/section-4-syntax-highlighting.md
+++ b/docs/section-4-syntax-highlighting.md
@@ -4,3 +4,7 @@ permalink: syntax-highlighting
 ---
 
 # Syntax Highlighting
+
+Syntax highlighting is a very common feature in applications that deal with code. Tree-sitter has built-in support for syntax highlighting, via the [`tree-sitter-highlight`](https://github.com/tree-sitter/tree-sitter/tree/master/highlight) library. This system is currently used on GitHub.com for highlighting code written in several langauges.
+
+**Note - If you are working on syntax highlighting in the [Atom](https://atom.io/) text editor, you should consult [this page](https://flight-manual.atom.io/hacking-atom/sections/creating-a-grammar/) in the Atom Flight Manual. Atom currently uses a different syntax highlighting system that is also based on Tree-sitter, but is older than the one described in this document.**

From 92f060303c0231a584430dd780b65bcec394098b Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Wed, 12 Feb 2020 15:49:57 -0800
Subject: [PATCH 03/10] docs: First draft of highlight query section, start local var section

---
 docs/section-4-syntax-highlighting.md | 104 +++++++++++++++++++++++++-
 1 file changed, 103 insertions(+), 1 deletion(-)

diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md
index 44e6e6de..96516e4b 100644
--- a/docs/section-4-syntax-highlighting.md
+++ b/docs/section-4-syntax-highlighting.md
@@ -5,6 +5,108 @@ permalink: syntax-highlighting
 
 # Syntax Highlighting
 
-Syntax highlighting is a very common feature in applications that deal with code. Tree-sitter has built-in support for syntax highlighting, via the [`tree-sitter-highlight`](https://github.com/tree-sitter/tree-sitter/tree/master/highlight) library. This system is currently used on GitHub.com for highlighting code written in several langauges.
+Syntax highlighting is a very common feature in applications that deal with code. Tree-sitter has built-in support for syntax highlighting, via the [`tree-sitter-highlight`](https://github.com/tree-sitter/tree-sitter/tree/master/highlight) library. This system is currently used on GitHub.com for highlighting code written in several languages.
 
 **Note - If you are working on syntax highlighting in the [Atom](https://atom.io/) text editor, you should consult [this page](https://flight-manual.atom.io/hacking-atom/sections/creating-a-grammar/) in the Atom Flight Manual. Atom currently uses a different syntax highlighting system that is also based on Tree-sitter, but is older than the one described in this document.**
+
+## Overview
+
+Tree-sitter's syntax highlighting system is based on *tree queries*, which are a general system for pattern-matching on Tree-sitter's syntax trees. See [this section](./using-parsers#pattern-matching-with-queries) of the documentation for more information about tree queries.
+
+Syntax highlighting queries for a given language are normally included in the same git repository as the Tree-sitter grammar for that language, in a top-level directory called `queries`. For an example, see the `queries` directory in the [`tree-sitter-ruby` repository](https://github.com/tree-sitter/tree-sitter-ruby/tree/master/queries).
+
+Highlighting is controlled by *three* different types of query files that can be included in the `queries` folder.
+
+* The highlights query (required, with default name `highlights.scm`)
+* The local variable query (optional, with default name `locals.scm`)
+* The language injection query (optional, with default name `injections.scm`)
+
+## Highlights Query
+
+The most important query is called the highlights query. The highlights query uses *captures* to assign arbitrary *highlight names* to different nodes in the tree. Each highlight name can then be mapped to a color. Commonly used highlight names include `keyword`, `function`, `type`, `property`, and `string`. Names can also be dot-separated like `function.builtin`.
+
+For example, consider the following Go code:
+
+```go
+func increment(a int) int {
+    return a + 1
+}
+```
+
+With this syntax tree:
+
+```
+(source_file
+  (function_declaration
+    name: (identifier)
+    parameters: (parameter_list
+      (parameter_declaration
+        name: (identifier)
+        type: (type_identifier)))
+    result: (type_identifier)
+    body: (block
+      (return_statement
+        (expression_list
+          (binary_expression
+            left: (identifier)
+            right: (int_literal)))))))
+```
+
+Suppose we wanted to render this code with the following colors:
+* keywords `func` and `return` in purple
+* function `increment` in blue
+* type `int` in green
+* number `5` brown
+
+We can assign each of these categories a *highlight name* using a query like this:
+
+```
+"func" @keyword
+"return" @keyword
+
+(function_declaration
+  name: (identifier) @function)
+
+(type_identifier) @type
+
+(int_literal) @number
+```
+
+And we could map each of these highlight names to a color:
+
+```json
+{
+  "theme": {
+    "keyword": "purple",
+    "function": "blue",
+    "type": "green",
+    "number": "brown"
+  }
+}
+```
+
+## Local Variable Query
+
+Good syntax highlighting helps the reader to quickly distinguish between the different types of 'entities' in their code. Ideally, if a given entity appears in *multiple* places, it should be colored the same in each place. The Tree-sitter syntax highlighting system can help you to achieve this by keeping track of local scopes and variables.
+
+The *local variables* query is different from the highlights query in that, while the highlights query uses *arbitrary* capture names which can then be mapped to colors, the locals variable query uses a fixed set of capture names, each of which has a special meaning.
+
+The capture names are as follows:
+
+* `@local.scope` - indicates that a syntax node introduces a new local scope.
+* `@local.definition` - indicates that a syntax node contains the *name* of a definition within the current local scope.
+* `@local.reference` - indicates that a syntax node contains the *name* which *may* refer to an earlier definition within some enclosing scope.
+
+When highlighting a file, Tree-sitter will keep track of the set of scopes that contains any given position, and the set of definitions within each scope. When processing a syntax node that is captured as a `local.reference`, Tree-sitter will try to find a definition for a name that that matches the node's text. If it finds a match, Tree-sitter will ensure that the *reference* and the *definition* are colored the same.
+
+For example, consider this Ruby code:
+
+```
+def increment_all(list)
+  list.map do |item|
+    item + 1
+  end
+end
+```
+
+## Language Injection Query

From 360b1886441fd1998c288c56758c574081b0c900 Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Wed, 12 Feb 2020 17:23:08 -0800
Subject: [PATCH 04/10] cli: Handle 'underline' styling when highlighting w/ HTML output

---
 cli/src/highlight.rs | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/cli/src/highlight.rs b/cli/src/highlight.rs
index 2fa2e8b0..7828069c 100644
--- a/cli/src/highlight.rs
+++ b/cli/src/highlight.rs
@@ -230,6 +230,9 @@ fn parse_color(json: Value) -> Option {
 fn style_to_css(style: ansi_term::Style) -> String {
     use std::fmt::Write;
     let mut result = "style='".to_string();
+    if style.is_underline {
+        write!(&mut result, "text-decoration: underline;").unwrap();
+    }
     if style.is_bold {
         write!(&mut result, "font-weight: bold;").unwrap();
     }

From cf80f094acd83b2ee080792d50f6a5cea1cd437b Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Wed, 12 Feb 2020 17:23:29 -0800
Subject: [PATCH 05/10] docs: Expand local variable highlighting section

---
 docs/section-4-syntax-highlighting.md | 129 +++++++++++++++++++++++---
 1 file changed, 118 insertions(+), 11 deletions(-)

diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md
index 96516e4b..a8dbb5a0 100644
--- a/docs/section-4-syntax-highlighting.md
+++ b/docs/section-4-syntax-highlighting.md
@@ -21,10 +21,14 @@ Highlighting is controlled by *three* different types of query files that can be
 * The local variable query (optional, with default name `locals.scm`)
 * The language injection query (optional, with default name `injections.scm`)
 
+The default names for the query files use the `.scm` file. We chose this extension because it commonly used for files written in [Scheme](https://en.wikipedia.org/wiki/Scheme_%28programming_language%29), a popular dialect of Lisp, and these query files use a Lisp-like syntax. But alternatively, you can think of `SCM` as an acronym for "Source Code Matching".
+
 ## Highlights Query
 
 The most important query is called the highlights query. The highlights query uses *captures* to assign arbitrary *highlight names* to different nodes in the tree. Each highlight name can then be mapped to a color. Commonly used highlight names include `keyword`, `function`, `type`, `property`, and `string`. Names can also be dot-separated like `function.builtin`.
 
+#### Example Input
+
 For example, consider the following Go code:
 
 ```go
@@ -52,6 +56,8 @@ With this syntax tree:
             right: (int_literal)))))))
 ```
 
+#### Example Query
+
 Suppose we wanted to render this code with the following colors:
 * keywords `func` and `return` in purple
 * function `increment` in blue
@@ -60,16 +66,14 @@ Suppose we wanted to render this code with the following colors:
 
 We can assign each of these categories a *highlight name* using a query like this:
 
-```
+```clj
+; highlights.scm
+
 "func" @keyword
 "return" @keyword
-
-(function_declaration
-  name: (identifier) @function)
-
 (type_identifier) @type
-
 (int_literal) @number
+(function_declaration name: (identifier) @function)
 ```
 
 And we could map each of these highlight names to a color:
@@ -85,9 +89,17 @@ And we could map each of these highlight names to a color:
 }
 ```
 
+#### Result
+
+
+func increment(a int) int {
+    return a + 1
+}
+
+
 ## Local Variable Query
 
-Good syntax highlighting helps the reader to quickly distinguish between the different types of 'entities' in their code. Ideally, if a given entity appears in *multiple* places, it should be colored the same in each place. The Tree-sitter syntax highlighting system can help you to achieve this by keeping track of local scopes and variables.
+Good syntax highlighting helps the reader to quickly distinguish between the different types of *entities* in their code. Ideally, if a given entity appears in *multiple* places, it should be colored the same in each place. The Tree-sitter syntax highlighting system can help you to achieve this by keeping track of local scopes and variables.
 
 The *local variables* query is different from the highlights query in that, while the highlights query uses *arbitrary* capture names which can then be mapped to colors, the locals variable query uses a fixed set of capture names, each of which has a special meaning.
 
@@ -99,14 +111,109 @@ The capture names are as follows:
 
 When highlighting a file, Tree-sitter will keep track of the set of scopes that contains any given position, and the set of definitions within each scope. When processing a syntax node that is captured as a `local.reference`, Tree-sitter will try to find a definition for a name that that matches the node's text. If it finds a match, Tree-sitter will ensure that the *reference* and the *definition* are colored the same.
 
-For example, consider this Ruby code:
+#### Example Input
 
-```
-def increment_all(list)
+Consider this Ruby code:
+
+```ruby
+def process_list(list)
+  context = current_context
   list.map do |item|
-    item + 1
+    process_item(item, context)
   end
 end
+
+item = 5
+list = [item]
 ```
 
+With this syntax tree:
+
+```
+(program
+  (method
+    name: (identifier)
+    parameters: (method_parameters
+      (identifier))
+    (assignment
+      left: (identifier)
+      right: (identifier))
+    (method_call
+      method: (call
+        receiver: (identifier)
+        method: (identifier))
+      block: (do_block
+        (block_parameters
+          (identifier))
+        (method_call
+          method: (identifier)
+          arguments: (argument_list
+            (identifier)
+            (identifier))))))
+  (assignment
+    left: (identifier)
+    right: (integer))
+  (assignment
+    left: (identifier)
+    right: (array
+      (identifier))))
+```
+
+There are several different types of names within this method:
+
+* `process_list` is a method.
+* Within this method, `list` is a formal parameter
+* `context` is a local variable.
+* `current_context` is *not* a local variable, so it must be a method.
+* Within the `do` block, `item` is a formal parameter
+* Later on, `item` and `list` are both local variables (not formal parameters).
+
+#### Example Queries
+
+Let's write some queries that let us clearly distinguish between these types of names. First, set up the highlighting query, as described in the previous section. We'll assign distinct colors to method calls, method definitions, and formal parameters:
+
+```clj
+; highlights.scm
+
+(call method: (identifier) @function.method)
+(method_call method: (identifier) @function.method)
+
+(method name: (identifier) @function.method)
+
+(method_parameters (identifier) @variable.parameter)
+(block_parameters (identifier) @variable.parameter)
+```
+
+Then, we'll set up a local variable query to keep track of the variables and scopes. Here, we're indicating that methods and blocks create local *scopes*, parameters and assignments create *definitions*, and other identifiers should be considered *references*:
+
+```clj
+; locals.scm
+
+(method) @local.scope
+(do_block) @local.scope
+
+(method_parameters (identifier) @local.definition)
+(block_parameters (identifier) @local.definition)
+
+(assignment left:(identifier) @local.definition)
+
+(identifier) @local.reference
+```
+
+#### Result
+
+
+
+
+def process_list(list)
+  context = current_context
+  list.map do |item|
+    process_item(item, context)
+  end
+end
+
+item = 5
+list = [item]
+
+
 ## Language Injection Query

From 17071267e38c1abf2688a1ee666bb2f0ba45835e Mon Sep 17 00:00:00 2001
From: Max Brunsfeld
Date: Thu, 20 Feb 2020 14:38:37 -0800
Subject: [PATCH 06/10] docs: Start work on docs for injection queries

---
 docs/section-4-syntax-highlighting.md | 60 ++++++++++++++++++++++++---
 1 file changed, 54 insertions(+), 6 deletions(-)

diff --git a/docs/section-4-syntax-highlighting.md b/docs/section-4-syntax-highlighting.md
index a8dbb5a0..90697ecd 100644
--- a/docs/section-4-syntax-highlighting.md
+++ b/docs/section-4-syntax-highlighting.md
@@ -66,7 +66,7 @@ Suppose we wanted to render this code with the following colors:
 
 We can assign each of these categories a *highlight name* using a query like this:
 
-```clj
+```
 ; highlights.scm
 
 "func" @keyword
@@ -92,7 +92,7 @@ And we could map each of these highlight names to a color:
 #### Result
 
 
-func increment(a int) int {
+func increment(a int) int {
     return a + 1
 }
 
@@ -111,6 +111,8 @@ The capture names are as follows:
 
 When highlighting a file, Tree-sitter will keep track of the set of scopes that contains any given position, and the set of definitions within each scope. When processing a syntax node that is captured as a `local.reference`, Tree-sitter will try to find a definition for a name that that matches the node's text. If it finds a match, Tree-sitter will ensure that the *reference* and the *definition* are colored the same.
 
+The information produced by this query can also be *used* by the highlights query. You can *disable* a pattern for nodes which have been identified as local variables by adding the predicate `(is-not? local)` to the pattern. This is used in the example below:
+
 #### Example Input
 
 Consider this Ruby code:
@@ -172,7 +174,7 @@ There are several different types of names within this method:
 
 Let's write some queries that let us clearly distinguish between these types of names. First, set up the highlighting query, as described in the previous section. We'll assign distinct colors to method calls, method definitions, and formal parameters:
 
-```clj
+```
 ; highlights.scm
 
 (call method: (identifier) @function.method)
@@ -182,11 +184,14 @@ Let's write some queries that let us clearly distinguish between these types of
 
 (method_parameters (identifier) @variable.parameter)
 (block_parameters (identifier) @variable.parameter)
+
+((identifier) @function.method
+ (is-not? local))
 ```
 
 Then, we'll set up a local variable query to keep track of the variables and scopes. Here, we're indicating that methods and blocks create local *scopes*, parameters and assignments create *definitions*, and other identifiers should be considered *references*:
 
-```clj
+```
 ; locals.scm
 
 (method) @local.scope
@@ -202,8 +207,6 @@ Then, we'll set up a local variable query to keep track of the variables and sco
 
 #### Result
 
-
-
 
 def process_list(list)
   context = current_context
@@ -217,3 +220,48 @@ Then, we'll set up a local variable query to keep track of the variables and sco
 
 ## Language Injection Query
+
+Some source files contain code written in multiple different languages. Examples include:
+* HTML files, which can contain JavaScript inside of `