From 0109c877d57cdec5ea8ff70f4a76215a01b8b672 Mon Sep 17 00:00:00 2001
From: Amaan Qureshi <amaanq12@gmail.com>
Date: Sun, 6 Aug 2023 22:30:21 -0400
Subject: [PATCH] docs: document regex limitations

---
 docs/section-3-creating-parsers.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/section-3-creating-parsers.md b/docs/section-3-creating-parsers.md
index dd6ef102..61e31b23 100644
--- a/docs/section-3-creating-parsers.md
+++ b/docs/section-3-creating-parsers.md
@@ -229,6 +229,20 @@ The following is a complete list of built-in functions you can use in your `gram
 
 * **Symbols (the `$` object)** - Every grammar rule is written as a JavaScript function that takes a parameter conventionally called `$`. The syntax `$.identifier` is how you refer to another grammar symbol within a rule. Names starting with `$.MISSING` or `$.UNEXPECTED` should be avoided as they have special meaning for the `tree-sitter test` command.
 * **String and Regex literals** - The terminal symbols in a grammar are described using JavaScript strings and regular expressions. Of course during parsing, Tree-sitter does not actually use JavaScript's regex engine to evaluate these regexes; it generates its own regex-matching logic as part of each parser. Regex literals are just used as a convenient way of writing regular expressions in your grammar.
+* **Regex Limitations** - Currently, only a subset of JavaScript's regex syntax is
+supported. Some features, such as lookahead and lookbehind assertions, are not
+feasible to support in an LR(1) grammar, and certain flags are unnecessary for
+Tree-sitter. The following features are supported:
+
+  * Character classes
+  * Character ranges
+  * Character sets
+  * Quantifiers
+  * Alternation
+  * Grouping
+  * Unicode character escapes
+  * Unicode property escapes
+
 * **Sequences : `seq(rule1, rule2, ...)`** - This function creates a rule that matches any number of other rules, one after another. It is analogous to simply writing multiple symbols next to each other in [EBNF notation][ebnf].
 * **Alternatives : `choice(rule1, rule2, ...)`** - This function creates a rule that matches *one* of a set of possible rules. The order of the arguments does not matter. This is analogous to the `|` (pipe) operator in EBNF notation.
 * **Repetitions : `repeat(rule)`** - This function creates a rule that matches *zero-or-more* occurrences of a given rule. It is analogous to the `{x}` (curly brace) syntax in EBNF notation.