Add locations to the AST #21

Open
Chris00 opened this issue Sep 4, 2013 · 11 comments
@Chris00
Contributor

Chris00 commented Sep 4, 2013

Adding locations to the AST would be very useful for reporting errors. Say, for example, you want to run some code but it fails. Without locations, it is difficult to report a nice error message to the user.

@pw374
Contributor

pw374 commented Sep 4, 2013

Can you be more specific? I mean, there is no notion of error in Markdown...

@Chris00
Contributor Author

Chris00 commented Sep 4, 2013

It is not for Markdown processing in itself. It is for what you do once the markdown has been processed (e.g. check that the code is correct).

@pw374
Contributor

pw374 commented Sep 4, 2013

Ok, I understand that basically, for a node in the AST, you want to be able to know which Markdown expression led to it. I agree that it would be nice to have this feature.

I'm currently thinking that this might need a lot of code refactoring... I have to think more.
(The first problem I see is that we'd need good locations for the token list manipulated by the parser. When it comes out of the lexer, the list is certainly correct, but after a while it's likely that the parser will have messed with the list and made it weird. So the locations would have to be robust to the parser's manipulations. What I mean is that I think we cannot easily retrieve locations just by looking at the current token list, whereas if the parser were nicer with this list, we'd just have to read the list...)
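
One way to make locations robust to list manipulation is to attach a location to every token at lexing time, so that a token keeps its origin however the list is later rearranged. A minimal sketch, assuming hypothetical `loc` and `token` types and a toy `lex` function (none of this is OMD's actual lexer):

```ocaml
(* Hypothetical types for illustration; not OMD's real token type. *)
type loc = { line : int; col : int }

type token = Word of string | Star | Newline

(* Pair every token with the 1-based position where it starts.
   Spaces are skipped but still advance the column counter. *)
let lex (s : string) : (token * loc) list =
  let n = String.length s in
  let rec go i line col acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | '\n' -> go (i + 1) (line + 1) 1 ((Newline, { line; col }) :: acc)
      | '*' -> go (i + 1) line (col + 1) ((Star, { line; col }) :: acc)
      | ' ' -> go (i + 1) line (col + 1) acc
      | _ ->
        (* Scan to the end of the word. *)
        let j = ref i in
        while !j < n && not (List.mem s.[!j] [ ' '; '*'; '\n' ]) do
          incr j
        done;
        let w = String.sub s i (!j - i) in
        go !j line (col + (!j - i)) ((Word w, { line; col }) :: acc)
  in
  go 0 1 1 []
```

Because each location travels with its token, a parser that drops, reorders, or merges list elements can still report where any surviving token came from.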

If you have suggestions on how to do it, don't hesitate ;)

@Chris00
Contributor Author

Chris00 commented Sep 4, 2013

There is more. Since we intend to pre-process some files, an annotation like the # line directive in OCaml must be supported, so that HTML code can be added without losing the locations (think about the 99problems page on the web site: the fact that the answers are hidden requires wrapping them in HTML code but, if the solution contains an error, say a syntax problem, then we want to tell the author using the locations in the original file).
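
To illustrate the idea, here is a rough sketch of how a pre-processed file could carry such annotations. It assumes a directive of the form `# 12 "file.md"`, meaning "the next line is line 12 of file.md", as in OCaml. The functions `parse_directive` and `remap` are hypothetical, and a real design would need a syntax that cannot be confused with a Markdown heading:

```ocaml
(* Recognise a directive such as: # 12 "file.md"
   Returns None for any line that does not match. *)
let parse_directive (line : string) : (int * string) option =
  try Scanf.sscanf line "# %d %S" (fun n file -> Some (n, file))
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

(* Map each non-directive line of the pre-processed text to
   (original file, original line number, line contents). *)
let remap (default_file : string) (lines : string list)
    : (string * int * string) list =
  let step (file, n, acc) l =
    match parse_directive l with
    | Some (n', file') -> (file', n', acc)
    | None -> (file, n + 1, (file, n, l) :: acc)
  in
  let _, _, acc = List.fold_left step (default_file, 1, []) lines in
  List.rev acc
```

With this, an error found in wrapped content can be reported against the author's original file rather than against the generated one.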

@pw374
Contributor

pw374 commented Sep 4, 2013

Yes of course, if we know where we come from (at the AST level), then it's easy to produce the locations in, say, HTML comments or something more hack-ish like empty span tags. This could be done for each block (e.g., paragraph, blockquote, ...), for instance...
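
As a sketch of the HTML-comment variant, assuming each block already comes paired with a location (the `loc` and `block` types and both functions below are hypothetical, and no HTML escaping is done):

```ocaml
(* Hypothetical types for illustration; not OMD's real AST. *)
type loc = { first_line : int; last_line : int }

type block = Paragraph of string | Blockquote of string

let html_of_block = function
  | Paragraph s -> "<p>" ^ s ^ "</p>"
  | Blockquote s -> "<blockquote>" ^ s ^ "</blockquote>"

(* Emit one HTML comment carrying the source lines before each block. *)
let html_with_locs (doc : (loc * block) list) : string =
  doc
  |> List.map (fun (l, b) ->
         Printf.sprintf "<!-- lines %d-%d -->\n%s" l.first_line l.last_line
           (html_of_block b))
  |> String.concat "\n"
```

A later processing stage can then read the comments back to find which source lines produced a given piece of HTML.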

@darioteixeira

I've added Markdown support to Lambdoc via OMD (see here for the code). And indeed, the major problem is still the lack of location information. This is necessary in Lambdoc for two main reasons: first, because Lambdoc allows customisable feature sets (you may want to forbid your users from formatting text as bold, for a silly example); second, because not all OMD features are present in Lambdoc (nesting beyond H3, for instance). In both cases I need to know the line number where the offending input occurred, so I can present a user-friendly error message.
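
To make the two cases concrete, here is a minimal sketch of such a check, assuming blocks that carry a line number. All types and names are hypothetical and match neither Lambdoc's nor OMD's API:

```ocaml
(* Hypothetical, simplified document types. *)
type inline = Text of string | Bold of inline list

type block = Heading of int * inline list | Paragraph of inline list

type located = { line : int; block : block }

(* A customisable feature set. *)
type rules = { allow_bold : bool; max_heading : int }

(* Only top-level inlines are inspected in this sketch. *)
let rec has_bold = function
  | [] -> false
  | Bold _ :: _ -> true
  | Text _ :: rest -> has_bold rest

(* Return one message per violation, each prefixed with its line number. *)
let check (rules : rules) (doc : located list) : string list =
  List.concat
    (List.map
       (fun { line; block } ->
         let inlines =
           match block with Heading (_, is) | Paragraph is -> is
         in
         let heading_errors =
           match block with
           | Heading (n, _) when n > rules.max_heading ->
             [ Printf.sprintf "line %d: heading level %d is not supported" line n ]
           | _ -> []
         in
         let bold_errors =
           if (not rules.allow_bold) && has_bold inlines then
             [ Printf.sprintf "line %d: bold text is not allowed" line ]
           else []
         in
         heading_errors @ bold_errors)
       doc)
```

Without the `line` field, the same check could only say that a violation exists somewhere in the document.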

@nojb
Contributor

nojb commented Jun 20, 2020

I am not 100% sure, but this may make the cut for 2.0.

@shonfeder
Collaborator

I think the discussion on #223 is relevant here (arguably #223 is a generalization of this issue).

iiuc, if we start recording the line numbers that certain tokens were taken from during parsing, then we are definitely not constructing an AST any more. We are building a parse tree.

It seems clear there are a lot of users who would benefit from having access to a detailed parse tree of the markdown!
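
One way to picture the distinction is a location-annotated tree alongside a plain AST, with a function that forgets the locations. This is only a sketch with hypothetical types; it is not the design discussed in #223 or #234:

```ocaml
(* Hypothetical types for illustration. *)
type loc = { line : int }

(* Plain AST: structure only. *)
type ast = Text of string | Emph of ast list

(* Location-annotated tree: same shape, each node tagged with its origin. *)
type tree = { loc : loc; node : node }
and node = LText of string | LEmph of tree list

(* Forget the locations to recover the plain AST. *)
let rec strip (t : tree) : ast =
  match t.node with
  | LText s -> Text s
  | LEmph ts -> Emph (List.map strip ts)
```

Tools that need positions would work on `tree`, while higher-level uses would call `strip` and never see locations.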

@shonfeder
Collaborator

I've suggested an approach to accommodate this kind of feature while still producing an AST for higher-level uses here: #223 (comment)

@shonfeder
Collaborator

@sonologico found a better approach for an AST that can support this in #234, so I think we have a good way forward.

The next step will be to work out a sensible way of working this kind of additional information into the parsing routine.

shonfeder removed this from the 1.1.0 milestone on May 29, 2021

@artempyanykh

As another use-case, having location information would be great for LSP servers too.

For context, I'm talking about this kind of LSP server for Markdown. I wrote it in Rust, but after some time I found the ceremony around ownership/borrowing rather exhausting and am now thinking about rewriting it in either OCaml or F#.
