feat: add the ability to specify a custom decode function
parent e27160b118
commit 500f4326d5
10 changed files with 347 additions and 16 deletions
@@ -149,9 +149,22 @@ typedef struct {
    uint32_t *bytes_read
  );
  TSInputEncoding encoding;
  DecodeFunction decode;
} TSInput;
```
To decode text that is encoded in something other than UTF-8 or UTF-16, set the `decode` field of the input to a function that decodes that text. The signature of `DecodeFunction` is as follows:
```c
typedef uint32_t (*DecodeFunction)(
  const uint8_t *string,
  uint32_t length,
  int32_t *code_point
);
```
The `string` argument is a pointer to the text to decode, which comes from the `read` function, and the `length` argument is the length of `string` in bytes. The `code_point` argument is a pointer to an integer that receives the decoded code point; your `decode` callback should write to it. The function should return the number of bytes decoded.
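As a minimal sketch of a callback with this signature, the example below decodes ISO-8859-1 (Latin-1) text, where each byte is one code point with the same numeric value. The names `decode_latin1` and `count_code_points` are hypothetical and not part of Tree-sitter, the typedef is repeated only so the snippet compiles on its own, and returning 0 for an empty buffer is an assumption rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Same signature as the `DecodeFunction` typedef in the documentation.
typedef uint32_t (*DecodeFunction)(
  const uint8_t *string,
  uint32_t length,
  int32_t *code_point
);

// Hypothetical decoder for ISO-8859-1 (Latin-1) text: every byte is one
// code point with the same numeric value, so one byte is consumed per call.
static uint32_t decode_latin1(
  const uint8_t *string,
  uint32_t length,
  int32_t *code_point
) {
  assert(code_point != NULL);
  // Assumption: with no bytes available, report 0 bytes decoded.
  if (length == 0) {
    *code_point = 0;
    return 0;
  }
  assert(string != NULL);
  *code_point = (int32_t)string[0];
  return 1;
}

// Illustrative driver (not part of Tree-sitter): walks a buffer the way a
// caller might, advancing by the byte count that each call returns.
static uint32_t count_code_points(
  DecodeFunction decode,
  const uint8_t *text,
  uint32_t length
) {
  uint32_t offset = 0;
  uint32_t count = 0;
  while (offset < length) {
    int32_t code_point;
    uint32_t consumed = decode(text + offset, length - offset, &code_point);
    if (consumed == 0) break;  // avoid looping forever on a stalled decoder
    offset += consumed;
    count++;
  }
  return count;
}
```

A function like `decode_latin1` would then be assigned to the `decode` field of the input. A decoder for a variable-width encoding would follow the same shape but return however many bytes the current code point occupies.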
### Syntax Nodes
Tree-sitter provides a [DOM](https://en.wikipedia.org/wiki/Document_Object_Model)-style interface for inspecting syntax trees. A syntax node's _type_ is a string that indicates which grammar rule the node represents.