Phil Turnbull
5ef9c4d6aa
Increase size of ref_counts and add assertions
...
This bumps the size of the reference counts from 16- to 32-bit counters to make
it less likely to overflow. Also assert in the retain function that the
reference count didn't overflow.
32-bits seems big enough for non-pathological examples but a more fool-proof
fix may be to bump it to 64-bits.
2017-06-29 06:39:01 -07:00
Max Brunsfeld
dfd7b1f5f6
Consolidate memory management logic in Stack
2017-06-27 16:17:24 -07:00
Max Brunsfeld
0143bfdad4
Avoid use-after-free of external token states
...
Previously, it was possible for references to external token states to
outlive the trees to which those states belonged.
Now, instead of storing references to external token states in the Stack
and in the Lexer, we store references to the external token trees
themselves, and we retain the trees to prevent use-after-free.
2017-06-27 14:54:27 -07:00
Max Brunsfeld
f678018d3d
Avoid use-after-free when copying stack iterators
2017-06-27 14:54:27 -07:00
Max Brunsfeld
7b401de5a6
Don't use pointer equality to compare external token states
2017-05-03 09:57:09 -07:00
Max Brunsfeld
df520635c6
Prevent crash due to huge number of possible paths through parse stack
2017-02-20 14:34:10 -08:00
Max Brunsfeld
dc6598e07e
Include external token states in stack debug graphs
2017-01-30 21:58:27 -08:00
Max Brunsfeld
36608180d2
Store external token states in the parse stack
2017-01-08 22:06:05 -08:00
Max Brunsfeld
d57043b665
Add ability to store external token state per stack version
2017-01-04 21:22:23 -08:00
Max Brunsfeld
e7217f1bac
Clean up some methods in parser.c
2016-11-14 17:25:55 -08:00
Max Brunsfeld
535879a2bd
Represent byte, char and tree counts as 32 bit numbers
...
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32 bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
Max Brunsfeld
c9dcb29c6f
Remove the TS prefix from some internal type/function names
2016-11-09 20:59:05 -08:00
Max Brunsfeld
ca45acd6af
Suppress 'value computed is not used' warning on gcc
2016-11-05 21:23:03 -07:00
Max Brunsfeld
4106ecda43
Remove logic for recovering from OOM
2016-11-04 09:18:38 -07:00
Max Brunsfeld
eed54d95e1
Merge branch 'master' into changed-ranges
2016-10-16 21:10:25 -07:00
Max Brunsfeld
e149d94ff5
Remove generated parsers' dependency on runtime.h
2016-10-05 14:02:49 -07:00
Max Brunsfeld
cc62fe0375
Represent Lengths in terms of Points
2016-09-09 21:11:02 -07:00
Max Brunsfeld
d31934ac77
Avoid potential use after free in stack__iter
2016-09-05 21:41:33 -07:00
Max Brunsfeld
e0b0e29a2b
Update parse count correctly when repairing errors & undoing reductions
2016-09-01 10:04:20 -07:00
Max Brunsfeld
71fdd9aa49
Remove error_depth tracking from the stack
2016-08-31 17:30:16 -07:00
Max Brunsfeld
7483da4184
Add push_count to stack, use it in error comparisons
2016-08-31 17:29:14 -07:00
Max Brunsfeld
0faae52132
Fix some inconsistencies in error cost calculation
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-31 10:51:59 -07:00
Max Brunsfeld
1d0f6c3cc0
Rename error_size -> error_cost
2016-08-30 11:09:12 -07:00
Max Brunsfeld
52ccebbf80
Rename error_depth -> error_count
2016-08-30 09:44:40 -07:00
Max Brunsfeld
31d1160e21
Base error costs on top-level trees skipped and lines of text skipped
...
Rather than on the total number of tokens skipped
2016-08-29 17:06:23 -07:00
Max Brunsfeld
27c9cb4175
Show anonymous tokens in quotes in dot graphs
...
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-29 16:39:14 -07:00
Max Brunsfeld
8c26d99353
Store error recovery actions in the normal parse table
2016-06-27 14:07:47 -07:00
Max Brunsfeld
9538b5b879
Don't count extra trees toward stack versions' error costs
2016-06-26 22:46:40 -07:00
Max Brunsfeld
76975f56ec
Display node positions in tooltips in debugging graphs
2016-06-15 11:06:00 -07:00
Max Brunsfeld
f69d709650
Remove unused functions
2016-06-15 10:17:54 -07:00
Max Brunsfeld
ecc7399ed3
Fix stack breakdown procedure when there are trailing extra tokens
2016-06-14 20:25:33 -07:00
Max Brunsfeld
2109f0ed74
Handle allocation failures when copying tree arrays
2016-06-14 14:46:49 -07:00
Max Brunsfeld
f77c08eff5
Check for better stack versions earlier when handling errors
2016-06-12 17:41:25 -07:00
Max Brunsfeld
2b80e66188
Don't merge stack versions with different error costs
2016-06-12 17:27:08 -07:00
Max Brunsfeld
00a0939504
Abort erroneous parse versions more eagerly
2016-06-02 14:04:48 -07:00
Max Brunsfeld
1e42e68098
Handle cases where both valid and incomplete reduction paths end at the same stack node
2016-05-30 14:08:42 -07:00
Max Brunsfeld
ea47fdc0fe
Rework logic for when to abandon parses with errors
2016-05-29 22:36:47 -07:00
Max Brunsfeld
6535704870
Replace stack_merge_new function with two simpler functions
...
- merge(version1, version2)
- split(version)
2016-05-28 21:22:10 -07:00
Max Brunsfeld
e686478ad2
Rename stack_merge function to stack_merge_all
2016-05-28 20:24:08 -07:00
Max Brunsfeld
1e353381ff
Don't create error node in lexer unless token is completely invalid
...
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.
Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
2016-05-26 14:15:10 -07:00
Max Brunsfeld
7c859a07bb
Allow null trees in the stack
2016-05-26 13:34:57 -07:00
Max Brunsfeld
a3679fbb1f
Distinguish separators from main tokens via a property on transitions
...
It was incorrect to store it as a property on the lexical states themselves
2016-05-19 16:27:25 -07:00
Max Brunsfeld
77e0e4bb16
Fix some leaks after allocation failures occur
2016-05-16 10:53:31 -07:00
Max Brunsfeld
88053cf723
In tests, don’t record allocations while printing debug graphs
2016-05-16 10:44:19 -07:00
Max Brunsfeld
d50f6a58cc
Abort parse versions w/ worse errors when repairing an error
2016-05-16 10:33:19 -07:00
Max Brunsfeld
22c550c9d6
Discard tokens after error detection to find the best repair
...
* Use GLR stack-splitting to try all numbers of tokens to
discard until a repair is found.
* Check the validity of repairs by looking at the child trees,
rather than the statically-computed 'in-progress symbols' list
2016-05-11 13:49:43 -07:00
Max Brunsfeld
9d247e45b2
Deemphasize extra trees in stack debugging graphs
2016-05-01 15:24:50 -07:00
Max Brunsfeld
31f6b2e24a
Refactor construction of out-of-context states
2016-04-25 21:59:40 -07:00
Max Brunsfeld
e99a3925e0
Merge all versions created in a given reduce operation
2016-04-24 00:55:19 -07:00
Max Brunsfeld
fe74c6fb34
Explicitly mark stack versions in debugging graphs
2016-04-24 00:54:59 -07:00