Parsing is slow relative to LLVM #6
Comments
Interesting, turning the garbage collector off with 70% of CPU time is spent in mapassign, as best as I can tell simply executing this line: https://github.com/mewmew/l/blob/2f2fc8d9956b16f583893cb6bf27b02311d285b2/ir/irutil/walk.go#L95 |
If I turn module resolution off (the bulk of the code in
If I turn module resolution off and set
|
Hi @pwaller, Thanks for the feedback! It's great that you've started to profile and locate some of the performance bottlenecks. Yes, map access is a common concern, and can often be solved rather easily. A similar case was identified in a hand-written lexer that was part of I think there are also several other low-hanging fruits to improve the performance of the parsing. The main one is related to the parser generated by
The goal of this project is to be roughly on par with the performance of the official LLVM project. Perhaps ~2x slowdown is ok, but definitely not >= 10x slowdown. That being said, the immediate goal is to complete llir/llvm#29 and merge mewmew/l back into llir/llvm. Then once llir/llvm supports read/write of the entire LLVM IR language, we will switch focus to performance optimizations. Cheers, |
Currently evaluating using Textmapper instead of Gocc to generate the lexer and parser, hopefully this may help improve the performance. In inspirer/textmapper#6 (comment) @inspirer mentions the following:
The evaluation is tracked in this repo: https://github.com/mewmew/l-tm |
The grammar has now been rewritten for Textmapper (no semantic actions yet though). And the initial performance results are great! As a benchmark I tried parsing the LLVM IR assembly produced when compiling Coreutils with Clang. The LLVM IR files are located at https://github.com/decomp/testdata/tree/master/coreutils/testdata And the benchmark results are reported at https://github.com/mewmew/l-tm/blob/master/BENCH.md Parsing 1,733,842 lines and 135 MB of LLVM IR assembly, as contained in the 107 source files at Notice that these results are only from shift/reduce parsing. No semantic actions are used yet to produce the AST, resolve identifiers, do type checking, etc. That is yet to be done. As for parsing, though, I think Textmapper is a good choice for performance. |
@pwaller as of mewspring/l-tm#1 we are now able to construct an AST for the entire LLVM IR language as defined by LLVM 7.0 (except for modules with The step to translate the AST to the IR representation is yet to be done. However, this gives us an indication of the performance we can expect. Official LLVM:
llir using parser generated by Textmapper:
To replicate these steps, do as follows: Download and install Textmapper. The tool is currently written in Java and generates parsers in Go (similar to ANTLR). @inspirer is working on porting the Textmapper tool to Go (see inspirer/textmapper#6).
go get mewmew/l-tm and generate the LLVM IR parser from the grammar using Textmapper.
Now, use
And without printouts:
|
Impressive, great work! I watch with interest. I assume this is without any kind of resolution? Do you have a plan for attacking the symbol resolution problem? I have this sense that the current code must be descending into nodes far more often than necessary. Also, because of the use of interfaces it must be doing a lot of work. I'd be interested to hack together some generated code which does this as a side project, I might have some time for that on Friday 19th, perhaps. |
Indeed, this is before resolution.
I'm open to suggestions :) How would you attack the problem? Basically, I learn as I go. Have learned a great deal about LR, LALR, Pager's LR(k), etc these last few weeks. The aim of Textmapper is more or less precisely what I was hoping for, basically ANTLR but using LR instead of LL and Go instead of Java.
Definitely. I'd love to hack on this with you. Friday the 19th, it's a date! |
@pwaller We now have type resolution of type definitions (in the translate branch). I've added a Note: I'm translating to the IR of llir/l which is basically a merge between The preliminary results are as follows: Type resolution of type definitions enabled
Type resolution of type definitions disabled
Running on all 107 LLVM IR files of Coreutils
Type resolution of type definitions enabled
Type resolution of type definitions disabled
|
😮 nice! 💯 Is this now doing more or less equivalent work as when I made the original report, or are there any other major bits missing? |
Oh, no.. Much work left to do. It's a start though :) This is the list of things to do to translate the AST to IR, roughly:
For entire module
Identifier resolution and translation in module scope.
For each function
Identifier resolution and translation in function scope.
As you can see, there are a lot of steps. And any of these steps may become the bottleneck if implemented poorly. I'm fairly certain that was the case last time around. And given your profiling, it seems identifier resolution killed the performance by walking the AST too many times (and the AST walker tracked visited nodes to avoid infinite recursion). So, any ideas on how to better implement this are also warmly welcome. |
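One possible way to avoid repeated AST walks, sketched here only as an idea: index all definitions by name in one pass, then resolve every reference with a single map lookup in a second pass. The `def`, `use` and `module` types and the `resolve` function are hypothetical and much simpler than the real AST. Because each definition and each use is touched exactly once, no visited-node tracking is needed.

```go
package main

import "fmt"

// def is a hypothetical definition, such as a global variable or function.
type def struct {
	name string
}

// use is a hypothetical reference to a definition by name. The target field
// is filled in by resolve.
type use struct {
	name   string
	target *def
}

// module is a hypothetical container holding flat lists of definitions and
// uses, as they might be collected while building the AST.
type module struct {
	defs []*def
	uses []*use
}

// resolve links every use to its definition in two linear passes.
func resolve(m *module) error {
	// Pass 1: index definitions by name.
	index := make(map[string]*def, len(m.defs))
	for _, d := range m.defs {
		if _, dup := index[d.name]; dup {
			return fmt.Errorf("duplicate definition of %q", d.name)
		}
		index[d.name] = d
	}
	// Pass 2: resolve each use with one lookup. Forward references work
	// because the index is complete before any use is resolved.
	for _, u := range m.uses {
		d, ok := index[u.name]
		if !ok {
			return fmt.Errorf("unresolved identifier %q", u.name)
		}
		u.target = d
	}
	return nil
}

func main() {
	m := &module{
		defs: []*def{{name: "@main"}, {name: "@puts"}},
		uses: []*use{{name: "@puts"}, {name: "@main"}},
	}
	if err := resolve(m); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range m.uses {
		fmt.Printf("%s resolved to definition %s\n", u.name, u.target.name)
	}
}
```

The cost is linear in the number of definitions plus uses, with one map insert per definition rather than one per visited node.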
@pwaller would you like to participate in the future development of llir/llvm more closely? I can see from your previous involvement that you are quite persistent, which is required for a project of this size and ambition. I would be glad to invite you to the Also, do note that I don't expect you to work on this more than you want and understand that you have commitments both in personal life, work, time away from screen, etc. We all do :) So, please consider this a hobby project with a serious ambition in the long term. I would be very glad to invite you onboard, should you choose to join! Cheerful regards, Edit: and not to scare you with the big list above. I'll go ahead and implement most items in the weeks/months to come (so the above invitation was not to be interpreted as hey, I got this huge list, would you like to do it for me). I have a few projects I'm working on that would love to have this well implemented, so have to iterate a few times. |
Thanks for the very nicely written message. I am uncertain how much I'll be able to contribute, other than by doing as I am already doing. Let's have a chat on Friday! :) |
Sounds good! When may I catch you on Friday? |
I sent you a message about it on the 11th around 0900 UTC and 15th around 1700 UTC. Might they have gotten stuck in a spam filter? |
Sent a reply :) |
ref: mewspring/mewmew-l#6 (comment) Translate blockaddress to `&ir.ConstBlockAddress{Func: f, Block: &ir.BasicBlock{LocalName: local(old.Name())}}`, the point being that we create a dummy IR basic block, that will record the basic block name used in the blockaddress constant. Then, once function bodies have been translated, run `f.AssignIDs()` to assign local IDs to unnamed instruction results, basic blocks and function parameters. As far as I can tell, this would solve the dependency relationship mentioned above. Updates #6.
As a notice, this issue has been superseded by llir/llvm#34 |
We managed to achieve ~1.6x slowdown compared to the official LLVM version. This is before trying to profile and further optimize In relation, the dev branch of |
I have a 44kloc input `ir/testdata/coreutils/vdir.ll`, and `github.com/mewmew/l/asm.Parse(input)` is taking 8 CPU-seconds (4s wall). `opt -verify < ir/testdata/coreutils/vdir.ll` takes 250ms (280ms wall). The relative slowdown is ~30x.
Is performance a goal of this project? I'm concerned about what this will look like when I feed it larger inputs.