Commit graph

166 commits

Author SHA1 Message Date
Max Brunsfeld
ee2906ac2e Don't merge stack versions that are halted 2017-09-12 16:19:28 -07:00
Max Brunsfeld
819235bac3 Limit the number of stack nodes that are included in a summary 2017-09-12 12:00:00 -07:00
Max Brunsfeld
99d048e016 Simplify error recovery; eliminate recovery states
The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>
2017-09-11 15:22:52 -07:00
Max Brunsfeld
f0e63adc84 Use __forceinline keyword instead of always_inline attribute on windows 2017-08-07 12:44:33 -07:00
Max Brunsfeld
b98669c7e6 Replace general array_reverse with ts_tree_array_reverse 2017-08-07 12:44:33 -07:00
Max Brunsfeld
9a04231ab1 Remove length restriction in external scanner serialization API 2017-07-17 17:12:36 -07:00
Max Brunsfeld
7293e6f0cc Fix compile warnings 2017-07-12 22:08:36 -07:00
Max Brunsfeld
d322f0b6a7 🎨 2017-07-04 21:59:54 -07:00
Max Brunsfeld
89e5037f01 Manually tail-call-optimize stack_node_release function 2017-06-30 17:49:09 -07:00
Max Brunsfeld
eccb3893eb Prune unneeded stack versions based on a depth criteria 2017-06-30 17:49:09 -07:00
Max Brunsfeld
061fba6b92 🎨 Just call it 'inline' 2017-06-29 16:49:59 -07:00
Max Brunsfeld
a89322c5f1 Remove unneeded parameters from public interface of stack_iterate callback 2017-06-29 16:43:56 -07:00
Max Brunsfeld
009d6d1534 Improve heuristics for pruning parse versions based on errors
* Rewrite the error cost comparison in terms of explicit, discrete
conditions.
* Allow merging versions have different error costs.
* Store the depth of each stack version since the last error. Use this
state to prevent incorrect merging.
* Sort the stack versions in order of preference and put a hard limit on
the version count.
2017-06-29 15:00:20 -07:00
Max Brunsfeld
445be0736a Clean up ts_stack_push function 2017-06-29 15:00:20 -07:00
Phil Turnbull
5ef9c4d6aa Increase size of ref_counts and add assertions
This bumps the size of the reference counts from 16- to 32-bit counters to make
it less likely to overflow. Also assert in the retain function that the
reference count didn't overflow.

32-bits seems big enough for non-pathological examples but a more fool-proof
fix may be to bump it to 64-bits.
2017-06-29 06:39:01 -07:00
Max Brunsfeld
dfd7b1f5f6 Consolidate memory management logic in Stack 2017-06-27 16:17:24 -07:00
Max Brunsfeld
0143bfdad4 Avoid use-after-free of external token states
Previously, it was possible for references to external token states to
outlive the trees to which those states belonged.

Now, instead of storing references to external token states in the Stack
and in the Lexer, we store references to the external token trees
themselves, and we retain the trees to prevent use-after-free.
2017-06-27 14:54:27 -07:00
Max Brunsfeld
f678018d3d Avoid use-after-free when copying stack iterators 2017-06-27 14:54:27 -07:00
Max Brunsfeld
7b401de5a6 Don't use pointer equality to compare external token states 2017-05-03 09:57:09 -07:00
Max Brunsfeld
df520635c6 Prevent crash due to huge number of possible paths through parse stack 2017-02-20 14:34:10 -08:00
Max Brunsfeld
dc6598e07e Include external token states in stack debug graphs 2017-01-30 21:58:27 -08:00
Max Brunsfeld
36608180d2 Store external token states in the parse stack 2017-01-08 22:06:05 -08:00
Max Brunsfeld
d57043b665 Add ability to store external token state per stack version 2017-01-04 21:22:23 -08:00
Max Brunsfeld
e7217f1bac Clean up some methods in parser.c 2016-11-14 17:25:55 -08:00
Max Brunsfeld
535879a2bd Represent byte, char and tree counts as 32 bit numbers
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32 bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
Max Brunsfeld
c9dcb29c6f Remove the TS prefix from some internal type/function names 2016-11-09 20:59:05 -08:00
Max Brunsfeld
ca45acd6af Suppress 'value computed is not used' warning on gcc 2016-11-05 21:23:03 -07:00
Max Brunsfeld
4106ecda43 Remove logic for recovering from OOM 2016-11-04 09:18:38 -07:00
Max Brunsfeld
eed54d95e1 Merge branch 'master' into changed-ranges 2016-10-16 21:10:25 -07:00
Max Brunsfeld
e149d94ff5 Remove generated parsers' dependency on runtime.h 2016-10-05 14:02:49 -07:00
Max Brunsfeld
cc62fe0375 Represent Lengths in terms of Points 2016-09-09 21:11:02 -07:00
Max Brunsfeld
d31934ac77 Avoid potential use after free in stack__iter 2016-09-05 21:41:33 -07:00
Max Brunsfeld
e0b0e29a2b Update parse count correctly when repairing errors & undoing reductions 2016-09-01 10:04:20 -07:00
Max Brunsfeld
71fdd9aa49 Remove error_depth tracking from the stack 2016-08-31 17:30:16 -07:00
Max Brunsfeld
7483da4184 Add push_count to stack, use it in error comparisons 2016-08-31 17:29:14 -07:00
Max Brunsfeld
0faae52132 Fix some inconsistencies in error cost calculation
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-31 10:51:59 -07:00
Max Brunsfeld
1d0f6c3cc0 Rename error_size -> error_cost 2016-08-30 11:09:12 -07:00
Max Brunsfeld
52ccebbf80 Rename error_depth -> error_count 2016-08-30 09:44:40 -07:00
Max Brunsfeld
31d1160e21 Base error costs on top-level trees skipped and lines of text skipped
Rather than on the total number of tokens skipped
2016-08-29 17:06:23 -07:00
Max Brunsfeld
27c9cb4175 Show anonymous tokens in quotes in dot graphs
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-29 16:39:14 -07:00
Max Brunsfeld
8c26d99353 Store error recovery actions in the normal parse table 2016-06-27 14:07:47 -07:00
Max Brunsfeld
9538b5b879 Don't count extra trees toward stack versions' error costs 2016-06-26 22:46:40 -07:00
Max Brunsfeld
76975f56ec Display node positions in tooltips in debugging graphs 2016-06-15 11:06:00 -07:00
Max Brunsfeld
f69d709650 Remove unused functions 2016-06-15 10:17:54 -07:00
Max Brunsfeld
ecc7399ed3 Fix stack breakdown procedure when there are trailing extra tokens 2016-06-14 20:25:33 -07:00
Max Brunsfeld
2109f0ed74 Handle allocation failures when copying tree arrays 2016-06-14 14:46:49 -07:00
Max Brunsfeld
f77c08eff5 Check for better stack versions earlier when handling errors 2016-06-12 17:41:25 -07:00
Max Brunsfeld
2b80e66188 Don't merge stack versions with different error costs 2016-06-12 17:27:08 -07:00
Max Brunsfeld
00a0939504 Abort erroneous parse versions more eagerly 2016-06-02 14:04:48 -07:00
Max Brunsfeld
1e42e68098 Handle cases where both valid and incomplete reduction paths end at the same stack node 2016-05-30 14:08:42 -07:00