Unexpected token ILLEGAL on regex \. #44
Comments
Note to my future self: It works on http://esprima.org/demo/parse.html, so the fix should be easy to find by doing a step-by-step debug session and finding the difference between esprima and esprima-dotnet. |
Forgot to mention it is from tokenizer:
|
Yes, I'm also very interested in fixing the lexer errors with regexes. I described similar errors in #40 (comment). |
I can't repro this issue. I added this test and it works fine both on master and dev. Can you provide a better unit test?
|
I got it in Esprima.Sample as you can see from the call stack.
[Fact]
public void ShouldRegularExpressionGH44()
{
    var scanner = new Scanner(@"var isHtml = /\.html$/");
    var tokens = new List<Token>();
    Token token;
    do
    {
        scanner.ScanComments();
        token = scanner.Lex();
        tokens.Add(token);
    } while (token.Type != TokenType.EOF);
} |
@sebastienros it does not work only with |
I'm still having this issue with a regex that also contains I cloned dev branch and ran the unit test posted by JarLob above, it still failed with both his regex and mine. In my case I'm experiencing the problem via a Jint execution of a file containing the problematic regex, using Jint 3.0.0-beta-1715, with Esprima 1.0.1258 |
Here is my particular regex:
var urlRegex = /(https?)\:\/\/[A-Za-z0-9\.\-]+(\/[A-Za-z0-9\?\&\=;\+!'\(\)\*\-\._~%]*)*/gi;
The error occurs on this line: https://github.com/sebastienros/esprima-dotnet/blob/dev/src/Esprima/Scanner.cs#L602 when processing the first |
I can't repro these issues if I use the parser directly, or the ScanRegex method. I think that the issue is in the Esprima.Sample source that you all seem to be following. The parser does more than just call Is your intent to actually iterate over each Token of a script, like the sample is supposed to work? |
We use
in a JInt script and that triggers the error in the Esprima dependency. |
Tested the above regex with the latest Jint 3 using the REPL and it worked just fine. Maybe someone should post a complete sample, including the library versions used. |
Here is the simplest failing program: This is the code from the sample project with that regex, and it is throwing the error in the title.
To be clear, this script Nevermind edit: It does not happen in Jint. It only happens in Scanner. |
Tracked this down: the root cause of the issue is the JS syntax itself, more precisely, the
To sum it up, you can't (reliably) use the scanner in standalone mode when the code contains regexps. That's kind of sad, but it looks like there's no solution to this problem. What we can maybe do to mitigate it is allow the user to provide some best-effort algorithm, which would enable tokenization in some specific cases instead of failing with an error. What do you think? Would such an addition make sense? |
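To make the "best effort" idea concrete, here is a rough sketch of the kind of heuristic a user could supply. It is written in JavaScript only to keep it self-contained, and everything in it is an assumption: the `{ type, value }` token shape and the `slashStartsRegex` name are hypothetical and not part of esprima-dotnet. It guesses from the previous significant token whether a `/` starts a regex literal or is a division operator.

```javascript
// Keywords after which an expression (and so a regex literal) can start.
const KEYWORDS_BEFORE_REGEX = new Set([
  'return', 'typeof', 'instanceof', 'in', 'of', 'new', 'delete',
  'void', 'throw', 'case', 'do', 'else', 'yield', 'await',
]);

// Best-effort guess: does a '/' seen right after `prev` start a regex literal?
// `prev` is the previous significant token ({ type, value }, a hypothetical
// shape), or null at the start of the input.
function slashStartsRegex(prev) {
  if (prev === null) return true; // nothing before it: cannot be division
  switch (prev.type) {
    case 'Identifier':
    case 'Numeric':
    case 'String':
    case 'Boolean':
    case 'Null':
    case 'RegularExpression':
      return false; // an operand just ended, so treat '/' as division
    case 'Keyword':
      // Keywords such as `this` end an operand; operator-like ones do not.
      return KEYWORDS_BEFORE_REGEX.has(prev.value);
    case 'Punctuator':
      // After ')', ']' or '}' we guess division. This guess is wrong for
      // `if (x) /re/.test(s)` and for a '}' that closes a block statement.
      return ![')', ']', '}'].includes(prev.value);
    default:
      return true;
  }
}
```

For `var isHtml = /\.html$/` the token before the `/` is the `=` punctuator, so the guess is "regex". The cases noted in the comments are exactly where a standalone scanner cannot decide without parser context, which is why this can only be a mitigation.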
This code (regex with \.) triggers unexpected token:
var isHtml = /\.html$/;