Performance #34
Comments
As we've reached the hand-wavy ~2x slowdown target for the v0.3.0 release, I've changed the milestone to future, as we will continually seek to improve the performance of llir. It would also be really interesting to explore using concurrency to make use of all the cores on multicore machines. I tried a naive approach and it seems to be a valid way to improve wall-clock run time. We just need to figure out how to do it in a less naive way, and to find the trade-off factors for when we should spin up goroutines and how many to run. We can define threshold limits for these; a rough sketch follows the notes below.

Challenge: beat LLVM wall-time performance using concurrency.

Anyone interested in experimenting a bit, have a look at asm/translate.go, which defines the high-level AST to IR translation order. In it, a few good candidates for concurrent execution have been identified.

Order of translation

Note: step 3 and the substeps of 4a can be done concurrently.
Note: steps 5-7 can be done concurrently.
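As a starting point, here is a minimal sketch of what a threshold-guarded fan-out could look like. The names entity, translateEntity, and the concurrencyThreshold value are placeholders for illustration, not part of the llir/llvm API; the real translation steps live in asm/translate.go and would need to be shown data-race free before being run concurrently.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// concurrencyThreshold is an assumed cut-off below which spawning
// goroutines is unlikely to pay for its overhead; the right value
// would have to be found by benchmarking.
const concurrencyThreshold = 64

// entity and translateEntity are hypothetical stand-ins for an AST
// top-level entity and its translation step.
type entity struct{ name string }

func translateEntity(e *entity) { _ = e.name }

// translateConcurrently falls back to a sequential loop for small
// inputs and otherwise fans the work out over GOMAXPROCS workers.
func translateConcurrently(entities []*entity) {
	if len(entities) < concurrencyThreshold {
		for _, e := range entities {
			translateEntity(e)
		}
		return
	}
	work := make(chan *entity)
	var wg sync.WaitGroup
	for i := 0; i < runtime.GOMAXPROCS(0); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for e := range work {
				translateEntity(e)
			}
		}()
	}
	for _, e := range entities {
		work <- e
	}
	close(work)
	wg.Wait()
}

func main() {
	es := make([]*entity, 1000)
	for i := range es {
		es[i] = &entity{name: fmt.Sprintf("entity%d", i)}
	}
	translateConcurrently(es)
}
```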
Filed: inspirer/textmapper#29
Moving performance-related note from
This issue is intended to profile the performance of the llir/llvm library, measure it against the official LLVM distribution, and evaluate different methods for improving the performance. This is a continuation of mewspring/mewmew-l#6.
The benchmark suite is at https://github.com/decomp/testdata. Specifically, the LLVM IR assembly of these projects is used in the benchmark:
Below follows a first evaluation of using concurrency to speed up parsing. The evaluation is based on a very naive implementation of concurrency, just to get some initial runtime numbers. It is based on commit 3011396 of the development branch, with subsets of the following patch applied: https://gist.github.com/mewmew/d127b562fdd8f560222b4ded739861a7
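For reference, a minimal wall-clock timing harness along these lines could look as follows. It assumes the current public API of llir/llvm (asm.ParseFile returning an *ir.Module); the path is a placeholder to be pointed at one of the testdata .ll files, and it measures parsing only, roughly comparable to timing opt -verify foo.ll.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/llir/llvm/asm"
)

func main() {
	// Placeholder path; point this at one of the benchmark files,
	// e.g. the Coreutils or SQLite .ll files from decomp/testdata.
	const path = "testdata/sqlite.ll"
	start := time.Now()
	m, err := asm.ParseFile(path)
	if err != nil {
		log.Fatalf("unable to parse %q: %v", path, err)
	}
	fmt.Printf("parsed %d functions from %q in %v\n", len(m.Funcs), path, time.Since(start))
}
```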
Official LLVM results

For comparison, below are the runtime results of the opt tool from the official LLVM distribution (using opt -verify foo.ll).

Coreutils

SQLite
llir/llvm results

Coreutils

Configurations measured:

- no concurrency
- concurrent translateTopLevelEntities
- concurrent translateGlobals (for global and function definitions)
- concurrent translateTopLevelEntities and translateGlobals (for global and function definitions)

SQLite3
Configurations measured:

- no concurrency
- concurrent translateTopLevelEntities
- concurrent translateGlobals (for global and function definitions)
- concurrent translateTopLevelEntities and translateGlobals (for global and function definitions)