From bfd56a1e59c7cfe000eccc8f2142c8eaa12a552f Mon Sep 17 00:00:00 2001
From: Andrew Hlynskyi
Date: Fri, 14 Apr 2023 03:43:03 +0300
Subject: [PATCH 1/2] docs: remove controversial `Earliest Starting Position` item added previously by 87a0517

---
 docs/section-3-creating-parsers.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/docs/section-3-creating-parsers.md b/docs/section-3-creating-parsers.md
index 842b87eb..a67cfc0c 100644
--- a/docs/section-3-creating-parsers.md
+++ b/docs/section-3-creating-parsers.md
@@ -530,8 +530,6 @@ Grammars often contain multiple tokens that can match the same characters. For e
 
 1. **Context-Aware Lexing** - Tree-sitter performs lexing on-demand, during the parsing process. At any given position in a source document, the lexer only tries to recognize tokens that are *valid* at that position in the document.
 
-1. **Earliest Starting Position** - Tree-sitter will prefer tokens with an earlier starting position. This is most often seen with very permissive regular expressions similar to `/.*/`, which are greedy and will consume as much text as possible. In this example the regex would consume all text until hitting a newline - even if text on that line could be interpreted as a different token.
-
 1. **Explicit Lexical Precedence** - When the precedence functions described [above](#the-grammar-dsl) are used within the `token` function, the given precedence values serve as instructions to the lexer. If there are two valid tokens that match the characters at a given position in the document, Tree-sitter will select the one with the higher precedence.
 
 1. **Match Length** - If multiple valid tokens with the same precedence match the characters at a given position in a document, Tree-sitter will select the token that matches the [longest sequence of characters][longest-match].
From b5e6d1808613509ac11af25fe3bb243b1b6d6eb8 Mon Sep 17 00:00:00 2001
From: Andrew Hlynskyi
Date: Fri, 14 Apr 2023 04:16:00 +0300
Subject: [PATCH 2/2] docs: add a grammar syntax sample for lexical precedence

---
 docs/section-3-creating-parsers.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/section-3-creating-parsers.md b/docs/section-3-creating-parsers.md
index a67cfc0c..eb664aec 100644
--- a/docs/section-3-creating-parsers.md
+++ b/docs/section-3-creating-parsers.md
@@ -530,7 +530,7 @@ Grammars often contain multiple tokens that can match the same characters. For e
 
 1. **Context-Aware Lexing** - Tree-sitter performs lexing on-demand, during the parsing process. At any given position in a source document, the lexer only tries to recognize tokens that are *valid* at that position in the document.
 
-1. **Explicit Lexical Precedence** - When the precedence functions described [above](#the-grammar-dsl) are used within the `token` function, the given precedence values serve as instructions to the lexer. If there are two valid tokens that match the characters at a given position in the document, Tree-sitter will select the one with the higher precedence.
+1. **Explicit Lexical Precedence** - When the precedence functions described [above](#the-grammar-dsl) are used within the `token` function like `token(prec(N, ...))`, the given precedence values serve as instructions to the lexer. If there are two valid tokens that match the characters at a given position in the document, Tree-sitter will select the one with the higher precedence.
 
 1. **Match Length** - If multiple valid tokens with the same precedence match the characters at a given position in a document, Tree-sitter will select the token that matches the [longest sequence of characters][longest-match].
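
Review note: the three rules that remain after this series (context-aware lexing, explicit lexical precedence, match length) can be illustrated with a small stand-alone sketch. This is not Tree-sitter's lexer and does not use the grammar DSL; it is plain JavaScript that only mimics the ordering of the rules as the documentation words them. The token table, the `chooseToken` helper, and the precedence values are all hypothetical names made up for illustration.

```javascript
// Hypothetical token table. Names, patterns, and precedence values are
// illustrative assumptions, not taken from any real grammar.
// The `y` (sticky) flag makes each regex match only at `lastIndex`.
const TOKENS = [
  { name: "keyword_if", pattern: /if/y, prec: 1 },
  { name: "identifier", pattern: /[a-z_]+/y, prec: 0 },
  { name: "integer", pattern: /[0-9]+/y, prec: 0 },
  { name: "float", pattern: /[0-9]+\.[0-9]+/y, prec: 0 },
];

// Choose one token starting at `pos`, applying the documented rules in order.
// `validNames` stands in for the set of tokens the parser would accept here.
function chooseToken(source, pos, validNames) {
  let best = null;
  for (const token of TOKENS) {
    // Rule 1 (context-aware lexing): ignore tokens that are not valid here.
    if (!validNames.includes(token.name)) continue;

    token.pattern.lastIndex = pos;
    const match = token.pattern.exec(source);
    if (match === null) continue;

    const candidate = { name: token.name, text: match[0], prec: token.prec };
    const higherPrec = best !== null && candidate.prec > best.prec;
    const samePrecLonger =
      best !== null &&
      candidate.prec === best.prec &&
      candidate.text.length > best.text.length;

    // Rule 2 (explicit lexical precedence) is checked before
    // rule 3 (match length), which only breaks ties in precedence.
    if (best === null || higherPrec || samePrecLonger) {
      best = candidate;
    }
  }
  return best; // null when no valid token matches at `pos`
}
```

Under these assumptions, `chooseToken("iffy", 0, ["keyword_if", "identifier"])` picks `keyword_if` because its precedence is higher, even though `identifier` would match more characters. With `validNames` reduced to `["identifier"]`, the same input yields `identifier` with the text `iffy`. For `"3.14"` with both `integer` and `float` valid, the two candidates have equal precedence, so the longer `float` match wins.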