91 commits

Author SHA1 Message Date
Max Brunsfeld
2fa7b453c8 Restore external scanner's state only after repositioning lexer
Also, properly identify the leaf node with the external token state
2016-12-21 13:59:56 -08:00
Max Brunsfeld
0e595346be Make lexer log output easier to read 2016-12-09 13:33:37 -08:00
Max Brunsfeld
c4fe8ded95 Remove state argument to Lexer advance method 2016-12-05 16:36:34 -08:00
Max Brunsfeld
0f8e130687 Call external scanner functions when lexing 2016-12-02 22:03:48 -08:00
Max Brunsfeld
5332fd3418 Fix build warnings 2016-11-19 20:47:43 -08:00
Max Brunsfeld
535879a2bd Represent byte, char and tree counts as 32 bit numbers
The parser spends the majority of its time allocating and freeing trees and stack nodes.
Also, the memory footprint of the AST is a significant concern when using tree-sitter
with large files. This library is already unlikely to work very well with source files
larger than 4GB, so representing rows, columns, byte lengths and child indices as
unsigned 32 bit integers seems like the right choice.
2016-11-14 12:19:13 -08:00
Max Brunsfeld
c9dcb29c6f Remove the TS prefix from some internal type/function names 2016-11-09 20:59:05 -08:00
Max Brunsfeld
eed54d95e1 Merge branch 'master' into changed-ranges 2016-10-16 21:10:25 -07:00
Max Brunsfeld
e149d94ff5 Remove generated parsers' dependency on runtime.h 2016-10-05 14:02:49 -07:00
Max Brunsfeld
cc62fe0375 Represent Lengths in terms of Points 2016-09-09 21:11:02 -07:00
Max Brunsfeld
38241d466b Rename .read_fn, .seek_fn -> .read, .seek 2016-09-06 21:39:10 -07:00
Max Brunsfeld
096ac2d4b6 Rename ts_document_set_debugger -> ts_document_set_logger 2016-09-06 17:40:26 -07:00
Max Brunsfeld
e2ca55c918 Avoid unnecessary TSInput calls when resetting lexer within an existing chunk 2016-09-06 10:23:07 -07:00
Max Brunsfeld
4f0c83ba01 Move logic for lexical error handling outside of lexer functions
This way, less logic needs to be exposed in parser.h
2016-09-03 23:40:57 -07:00
Max Brunsfeld
1c52c30111 Fix unexpected EOF errors getting lost 2016-09-03 22:46:14 -07:00
Max Brunsfeld
38c144b4a3 Refine logic for deciding when tokens need to be re-lexed
* While generating the lex table, note which tokens can match the
  same string. A token needs to be relexed when it has possible
  homonyms in the current state.
* Also note which tokens can match substrings of other tokens.
  A token needs to be relexed when there are viable tokens that
  could match longer strings in the current state and the next
  token has been edited.
* Remove the logic for marking tokens as fragile on creation.
* Move the reusability/non-reusability of symbols off of individual
  actions and onto the entire entry for the state & symbol.
2016-06-21 07:28:04 -07:00
Max Brunsfeld
1e353381ff Don't create error node in lexer unless token is completely invalid
Before, any syntax error would cause the lexer to create an error
leaf node. This could happen even with a valid input, if the parse
stack had split and one particular version of the parse stack
failed to parse.

Now, an error leaf node is only created when the lexer cannot understand
part of the input stream at all. When a normal syntax error occurs,
the lexer just returns a token that is outside of the expected token
set, and the parser handles the unexpected token.
2016-05-26 14:15:10 -07:00
Max Brunsfeld
a3679fbb1f Distinguish separators from main tokens via a property on transitions
It was incorrect to store it as a property on the lexical states themselves
2016-05-19 16:27:25 -07:00
Max Brunsfeld
c96c4a08e6 Use an object pool for stack nodes, to reduce allocations
Also, fix some leaks in the case where memory allocation failed during parsing
2016-02-04 11:19:42 -08:00
Max Brunsfeld
3dde0a6f39 Handle allocation failures during parsing 2016-01-19 18:08:01 -08:00
Max Brunsfeld
f2e7058ad9 Support UTF16 directly
This makes the API easier to use from javascript
2015-12-28 13:53:22 -08:00
Max Brunsfeld
da1bc038e5 Remove nested options structs in Tree 2015-12-22 14:20:58 -08:00
Max Brunsfeld
2bcd2e4d00 Reuse fragile tokens that came from the current lex state 2015-12-21 16:04:11 -08:00
Max Brunsfeld
d713054d61 Record which tokens are fragile when lexing 2015-12-10 21:05:54 -08:00
Max Brunsfeld
08d50c25ae clang-format 2015-12-04 20:56:33 -08:00
Max Brunsfeld
d2bf88d5fe Include rows and columns in TSLength
This way, we don't have to have separate 1D and 2D versions for so many values
2015-12-04 20:20:29 -08:00
Max Brunsfeld
8e217f758c Use individual args instead of TSLength in input seek function 2015-12-03 23:06:01 -08:00
Max Brunsfeld
8a146a9bef Reset lexer correctly when old input was blank 2015-12-03 10:00:39 -08:00
Max Brunsfeld
f08554e958 Replace NodeType enum with SymbolMetadata bitfield
This will allow storing other metadata about symbols, such as whether
they only appear as ubiquitous tokens
2015-12-02 15:10:24 -08:00
joshvera
9da4aeaeff columns start at 0 for sanity's sake 2015-11-30 17:22:47 -05:00
joshvera
cc77889d11 combine logs 2015-11-30 14:19:50 -05:00
joshvera
88d3432787 Merge remote-tracking branch 'joshvera/line-numbers' into line-numbers 2015-11-30 13:06:54 -05:00
joshvera
7633cbb836 indentation 2015-11-30 12:59:23 -05:00
joshvera
4af3b7d0fd Add offset_point to LookaheadState 2015-11-30 12:50:16 -05:00
joshvera
f5fc247c8b Merge remote-tracking branch 'origin/master' into line-numbers 2015-11-30 12:36:11 -05:00
joshvera
4cbc4b8bcf Revert "try starting from 1"
This reverts commit 11efff2442.
2015-11-30 12:16:58 -05:00
joshvera
11efff2442 try starting from 1 2015-11-25 14:25:11 -05:00
joshvera
3d9a44d880 Calculate the column and offset separately in TSNode 2015-11-25 13:36:19 -05:00
joshvera
4663b9ce89 Add padding and size points to ts_tree_make_leaf in ts_lexer__accept 2015-11-25 11:44:13 -05:00
Max Brunsfeld
7aba2a0716 Rename DEBUG macro to LOG
DEBUG is already used as the symbol to enable/disable assert() calls
2015-11-20 11:50:50 -08:00
Max Brunsfeld
64874449e4 Allow different parse stack heads to lex differently 2015-11-19 20:55:18 -08:00
joshvera
b0f6bac3ab replace start and end with padding and size 2015-11-18 16:34:50 -08:00
joshvera
a85b7fe3c4 start column at 0 again 2015-11-16 16:59:12 -08:00
joshvera
fc49a3949a try resetting to 1 2015-11-13 14:13:16 -05:00
joshvera
8058500c5b Add source info to TSTree 2015-11-12 15:32:53 -05:00
joshvera
bf666351e9 Set token_end_source_info 2015-11-12 13:28:33 -05:00
joshvera
e60ab58187 maybe increment line and column here? 2015-11-12 13:25:35 -05:00
joshvera
e720922662 Add source info to TSLexer 2015-11-12 12:24:05 -05:00
Max Brunsfeld
f5d861a019 Fix bug where ts_stack_pop results were backwards for some stack configurations 2015-10-28 12:10:45 -07:00
Max Brunsfeld
c885eea706 Add current position to lexer debug message 2015-10-26 12:47:54 -07:00