Max Brunsfeld
535879a2bd
Represent byte, char and tree counts as 32 bit numbers
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32-bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
Max Brunsfeld
c9dcb29c6f
Remove the TS prefix from some internal type/function names
2016-11-09 20:59:05 -08:00
Max Brunsfeld
4106ecda43
Remove logic for recovering from OOM
2016-11-04 09:18:38 -07:00
Max Brunsfeld
e149d94ff5
Remove generated parsers' dependency on runtime.h
2016-10-05 14:02:49 -07:00
Max Brunsfeld
e0b0e29a2b
Update parse count correctly when repairing errors & undoing reductions
2016-09-01 10:04:20 -07:00
Max Brunsfeld
7483da4184
Add push_count to stack, use it in error comparisons
2016-08-31 17:29:14 -07:00
Max Brunsfeld
0faae52132
Fix some inconsistencies in error cost calculation
Signed-off-by: Nathan Sobo <nathan@github.com>
2016-08-31 10:51:59 -07:00
Max Brunsfeld
52ccebbf80
Rename error_depth -> error_count
2016-08-30 09:44:40 -07:00
Max Brunsfeld
00a0939504
Abort erroneous parse versions more eagerly
2016-06-02 14:04:48 -07:00
Max Brunsfeld
ea47fdc0fe
Rework logic for when to abandon parses with errors
2016-05-29 22:36:47 -07:00
Max Brunsfeld
6535704870
Replace stack_merge_new function with two simpler functions
- merge(version1, version2)
- split(version)
2016-05-28 21:22:10 -07:00
Max Brunsfeld
e686478ad2
Rename stack_merge function to stack_merge_all
2016-05-28 20:24:08 -07:00
Max Brunsfeld
1e353381ff
Don't create error node in lexer unless token is completely invalid
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.
Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
2016-05-26 14:15:10 -07:00
Max Brunsfeld
88053cf723
In tests, don’t record allocations while printing debug graphs
2016-05-16 10:44:19 -07:00
Max Brunsfeld
d50f6a58cc
Abort parse versions w/ worse errors when repairing an error
2016-05-16 10:33:19 -07:00
Max Brunsfeld
22c550c9d6
Discard tokens after error detection to find the best repair
* Use GLR stack-splitting to try all numbers of tokens to
  discard until a repair is found.
* Check the validity of repairs by looking at the child trees,
  rather than the statically-computed 'in-progress symbols' list
2016-05-11 13:49:43 -07:00
Max Brunsfeld
e99a3925e0
Merge all versions created in a given reduce operation
2016-04-24 00:55:19 -07:00
Max Brunsfeld
fd4c33209e
Select ambiguous alternatives by minimizing error size
2016-04-24 00:54:20 -07:00
Max Brunsfeld
cad663b144
Consider multiple error repairs on the same path of the stack
This changes the API of the stack_iterate function so that you can pop
from the stack without stopping iteration
2016-04-15 21:28:00 -07:00
Max Brunsfeld
695be5bc79
Merge equivalent stacks in a separate stage of parsing
* No more automatic merging every time a state is pushed to the stack
* When popping from the stack, the current version is always preserved
2016-04-10 14:12:24 -07:00
Max Brunsfeld
5ba40f15ad
Rename stack heads to versions
2016-04-04 12:25:57 -07:00
Max Brunsfeld
b1a696085a
Clean up stack pop functions
2016-04-04 11:59:10 -07:00
Max Brunsfeld
2f3e92c9be
Add function for popping all nodes from the stack
2016-04-04 11:44:45 -07:00
Max Brunsfeld
91e3609fbf
Write to file directly from stack debugging function
2016-04-02 22:18:44 -07:00
Max Brunsfeld
6bce6da1e6
Store verifying flag within parse stack
2016-03-31 12:03:21 -07:00
Max Brunsfeld
e7d3d40a59
Explicitly inform stack pop callback when the stack is exhausted
Also, pass non-extra tree count as a single value, rather than keeping
track of the extra count and the total separately.
2016-03-10 11:51:55 -08:00
Max Brunsfeld
4348eb89d4
Expose lower stack nodes via pop_until() function
This callback-based API allows the parser to easily visit each interior node
of the stack when searching for an error repair. It is also a better abstraction
over the stack's DAG implementation than exposing public functions for
accessing entries and their successor entries.
2016-03-07 16:09:34 -08:00
Max Brunsfeld
c0595c21c5
Halt stack pops at all error states, not just error trees
2016-03-03 11:05:37 -08:00
Max Brunsfeld
3d516aeeec
Give StackPushResult enumerators shorter names
2016-03-03 10:20:05 -08:00
Max Brunsfeld
8a13b5d120
Rename StackPopResult -> StackSlice
2016-03-03 10:16:10 -08:00
Max Brunsfeld
5a34d74702
Clean up stack
2016-02-25 21:51:39 -08:00
Max Brunsfeld
da2ef7ad35
Store trees in the links between stack nodes, not in the nodes themselves
2016-02-23 17:35:50 -08:00
Max Brunsfeld
6dd92c3abe
Add function for rendering the stack as a DOT graph
2016-02-23 00:08:55 -08:00
Max Brunsfeld
f444a715fd
Clean up tree array assertions in stack spec
2016-02-22 09:23:25 -08:00
Max Brunsfeld
b113dc8b0f
Return a TreeArray from ts_stack_pop
Since the capacity is now included in the return value, the buffer
can be reused in the ts_parser__accept function. Also, it's just
cleaner to use Array consistently, rather than a separate buffer
and size.
2016-02-21 22:31:13 -08:00
Max Brunsfeld
3d7df851b5
Rename Vector -> Array
2016-02-17 20:41:29 -08:00
Max Brunsfeld
6fa7eca966
Make vector struct type-safe
2016-02-17 15:30:47 -08:00
Max Brunsfeld
e90a425618
Only return one result for each revealed head from ts_stack_pop
2016-02-08 12:08:15 -08:00
Max Brunsfeld
3dde0a6f39
Handle allocation failures during parsing
2016-01-19 18:08:01 -08:00
Max Brunsfeld
7fbb628c78
Remove TreeSelectionCallback struct
Just make a typedef for the function type
2015-12-17 12:09:06 -08:00
Max Brunsfeld
10286f307f
Pass reference to parser in stack's tree selection callback
2015-12-08 12:21:27 -08:00
Max Brunsfeld
d2bf88d5fe
Include rows and columns in TSLength
This way, we don't have to have separate 1D and 2D versions for so many values
2015-12-04 20:20:29 -08:00
Max Brunsfeld
863cabc827
Don't include trailing ubiquitous tokens as children when reducing
2015-12-02 15:31:15 -08:00
Max Brunsfeld
c88e9044d5
Make stack popping more robust
2015-11-20 00:04:21 -08:00
Max Brunsfeld
6254f45c1b
Rename ParseStack -> Stack
2015-09-18 22:02:06 -07:00
Max Brunsfeld
b3d883e128
Store edits in trees, not by splitting stack
This allows for multiple edits per parse, though it is not exposed through
the API yet
2015-09-18 22:02:06 -07:00
Max Brunsfeld
3d0890eecf
Preserve tokens within errors
2015-06-15 15:26:06 -07:00
Max Brunsfeld
80b8a0a9fb
Rename stack_right_position -> stack_total_tree_size
I want to re-use the stack data structure for storing the
re-usable nodes from the previous parse tree during an edit.
In this case, the stack won't conceptually start at position
zero, so the name 'right_position' doesn't make sense.
2014-10-08 17:37:21 -07:00
Max Brunsfeld
6d37877e49
Tweak debugging output
2014-10-05 16:56:29 -07:00
Max Brunsfeld
78c5fe8e02
clang-format
2014-10-03 15:44:21 -07:00