Simplify error recovery; eliminate recovery states

The previous approach to error recovery relied on special error-recovery
states in the parse table. For each token T, there was an error recovery
state in which the parser looked for *any* token that could follow T.
Unfortunately, sometimes the set of tokens that could follow T contained
conflicts. For example, in JS, the token '}' can be followed by the
open-ended 'template_chars' token, but also by ordinary tokens like
'identifier'. So with the old algorithm, when recovering from an
unexpected '}' token, the lexer had no way to distinguish identifiers
from template_chars.

This commit drops the error recovery states. Instead, when we encounter
an unexpected token T, we recover from the error by finding a previous
state S in the stack in which T would be valid, popping all of the nodes
after S, and wrapping them in an error.
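
As a rough illustration of the recovery step described above, here is a
minimal, self-contained sketch. It is not the code from this commit: the
names Node, StackEntry, recover and is_valid are hypothetical, the stack
is modeled as a plain vector, and the assumption that the error node is
pushed back in state S is ours.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not the parser's real structures.
using StateId = int;
using Token = std::string;

struct Node {
  std::string label;
  std::vector<Node> children;
};

struct StackEntry {
  StateId state;  // parse state reached after `node` was pushed
  Node node;
};

// Recover from an unexpected `token`: walk down from the top of the stack to
// the nearest state S in which the token is valid, pop every entry above S,
// and wrap the popped nodes in a single ERROR node.
// Returns false (leaving the stack untouched) if no such state exists.
bool recover(std::vector<StackEntry> &stack, const Token &token,
             const std::function<bool(StateId, const Token &)> &is_valid) {
  assert(!stack.empty());
  for (std::size_t i = stack.size(); i-- > 0;) {
    if (!is_valid(stack[i].state, token)) continue;

    const StateId s = stack[i].state;
    Node error{"ERROR", {}};
    for (std::size_t j = i + 1; j < stack.size(); ++j) {
      error.children.push_back(std::move(stack[j].node));
    }
    stack.erase(stack.begin() + static_cast<std::ptrdiff_t>(i) + 1, stack.end());

    // Assumption: the error node does not change the parse state, so the
    // parser continues in S and the lexer runs in a normal parse state.
    if (!error.children.empty()) {
      stack.push_back(StackEntry{s, std::move(error)});
    }
    return true;
  }
  return false;
}
```

The is_valid callback stands in for a parse-table lookup. Because the
search only ever lands on an ordinary parse state, the token set the
lexer is asked for next is the same one it would see in a normal parse.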

This way, the lexer is always invoked in a normal parse state, in which
it is looking for a non-conflicting set of tokens. Eliminating the error
recovery states also shrinks the lex state machine significantly.

Signed-off-by: Rick Winfrey <rewinfrey@github.com>

Max Brunsfeld, 2017-09-11 15:22:52 -07:00 • committed by Rick Winfrey

parent 8b3941764f

commit 99d048e016

15 changed files with 327 additions and 639 deletions

									
src/compiler/generate_code/c_code.cc

@@ -656,7 +656,7 @@ class CCodeGenerator {
               add(")");
               break;
             case ParseActionTypeRecover:
-              add("RECOVER(" + to_string(action.state_index) + ")");
+              add("RECOVER()");
               break;
             default: {}
           }
