building IRFuzzer #49
Hi John! Thanks for trying our tool! Unfortunately, I'm travelling this week. I will get back to you next week. |
no rush. |
I heard that from Yoyo at the LLVM Developers' Meeting 2022. Validating a backend is way harder than finding ICEs; we are pushing in that direction as well. Maybe we should schedule a talk or something :) |
that sounds great, let's talk sometime in March |
I was thinking a little more about how we could work together. one way is we build your fuzzer and then have it call our tools. but, also, if you want to run our translation validation tool, then that is easy too. I would think it would just plug into your workflow pretty seamlessly. basically you just hand it an IR function and it either tells you that it verified or else signals some sort of error. I'd be happy to help you get started using our software if you wanted to do that. it's all easy stuff. all I would ask is that you let us know about any miscompiles that you find, since we're still accumulating evidence that our stuff works well, in preparation for writing it up for publication. so far we've found and reported 29 silent miscompiles in LLVM's AArch64 backend. I'm sure there are a lot more bugs remaining but we need fuzzer magic to discover them. |
We have found some bugs, and we are still deduplicating/pinpointing them. Our experiment runs on the X86 backend, so we can compile the code directly on our host machine without needing an emulator, run it, and see the difference between O0 and O3. |
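A minimal sketch of the O0-versus-O3 comparison described above, not the IRFuzzer authors' actual harness. It assumes `clang` is on the PATH, that the IR file defines a `main` function, and that the program is free of undefined behavior (otherwise a difference proves nothing). The function names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def run_at_opt_level(ir_file: Path, opt_level: str, workdir: Path) -> tuple[int, bytes]:
    """Compile an LLVM IR file with clang at the given level, run it, return (exit code, stdout)."""
    exe = workdir / f"prog_{opt_level}"
    # clang accepts .ll/.bc inputs directly; check=True raises if compilation fails.
    subprocess.run(["clang", f"-{opt_level}", str(ir_file), "-o", str(exe)], check=True)
    result = subprocess.run([str(exe)], capture_output=True, timeout=10)
    return result.returncode, result.stdout


def behaviors_differ(low: tuple[int, bytes], high: tuple[int, bytes]) -> bool:
    """Two runs disagree if either the exit code or the captured stdout differs."""
    return low != high


def differential_test(ir_file: Path) -> bool:
    """Return True if the O0 and O3 builds of the same IR behave differently."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        low = run_at_opt_level(ir_file, "O0", workdir)
        high = run_at_opt_level(ir_file, "O3", workdir)
    return behaviors_differ(low, high)
```

A `True` result is only a candidate miscompile: it still has to be reduced and checked by hand, since a crash or timeout at one level can also come from the test program itself.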
git clone https://github.com/SecurityLab-UCD/IRFuzzer.git -b irfuzzer-0.3
cd IRFuzzer
./init.sh
./build.sh
I separated the initialization and building process. Please try this to reproduce the results we had in our paper. Besides, I don't think IRFuzzer 0.3 is a good fit for your needs. IRFuzzer is known to generate UB (e.g., Internally, we have a verify-asm branch that is customized to generate only defined behavior for that purpose. We should definitely talk about your needs, and maybe we can customize it for you as well. :) |
hi Peter, thanks!! I'll try this out soon. my testing workflow is based on Alive2 and it is extremely robust with respect to undefined behavior. even so, I think it would be best if you avoided undef because it is going to go away and also the LLVM people are very reluctant to fix miscompiles that are triggered by undef. but poison and immediate UB are perfectly fine. |
well, it got pretty far this time, here's what I ended up with on an x86-64 machine running Ubuntu 22.04:
|
(the build above was using the system compiler, gcc 11.4.0. before that, I tried building using clang 17, but in that case it didn't get nearly as far) |
anyway, I'm happy to wait for IRFuzzer 0.4 like I said, I do not require UB-free code, I just need functions in LLVM IR that can be processed by the AArch64 backend. if |
As for IRFuzzer 0.4, I think I will trim off some features so that IRFuzzer can rely directly on upstream LLVM instead of our customized LLVM, which would make updating easier. (I think calling it IRFuzzer-Alive2 would be more appropriate lol). I'm bogged down by ICSE 25, job hunting and some other projects, so I don't have a firm estimate, but you can expect it by next Sunday I guess. |
hi Peter, thanks! |
@regehr branch irfuzzer-alive is available now. Since you mentioned you are testing AArch64, this branch only compiled that backend (You can modify Looking back at your comment about |
thanks Peter, I'll try this out! here's a miscompile I reported last night: but this was the only one found after running our fuzzer for a couple of days on a pretty big machine. I'm afraid our fuzzer is running out of bugs to find -- hopefully yours isn't :) |
regarding undef, it would be best if you completely avoided generating undef. LLVM is trying to eliminate undef from the IR, and in the meantime people aren't really fixing bugs that contain undefs. so it's best to just not even try to find those bugs (which totally exist). regarding poison and immediate UB, you should feel free to generate tests that contain these. however, you want to avoid generating tests that are undefined on all paths, these are useless for finding miscompilation bugs (because they can be lowered to anything). |
ok, I'm doing a docker build, but this also failed, it ended with this:
|
just to be clear, I checked out the head of the irfuzzer-alive branch and then ran:
|
I messed up docker a bit. Finally sorted it out. Can you try it again? |
excellent-- this works for me now! let's try to figure out the next steps. first, I think that I should modify your dockerfile so that arm-tv (our fork of Alive2 that does translation validation for the AArch64 backend) gets built too. alas, this is going to require a third LLVM build since the one I need is top-of-tree + exceptions and RTTI. I'll also build a Z3 since we like to use the latest release and the ubuntu package is a few releases old. does it sound good if I give you a pull request that modifies your Dockerfile to do these things? second, we need to figure out how to run arm-tv on every IR function that you generate, and to log the results. I don't know the best way to fit that into your infrastructure, so maybe you can work on that part after I do the other thing? |
Yes, giving me a PR is perfectly fine. I'll try to review it ASAP.
My question is, do you want to run it online with IRFuzzer, or run it offline when IRFuzzer is done? You could run IRFuzzer, wait until it finishes, grab all the seeds it generated, and run arm-tv somewhere else. If that's the case, you don't have to modify the Dockerfile at all; just running the scripts and copying the results out of the container should be fine. However, it would be interesting (in terms of research) if you could provide some kind of feedback to IRFuzzer to guide its mutation. |
I'm surprised that a miscompile can be as simple as this -- and the current unit tests couldn't catch it. Maybe I should run IRFuzzer again to see if we get more bugs (Last large scale run was almost a year ago) |
ok, let's start with the simplest thing -- I'll run IRFuzzer for a while and grab the seeds and run them through arm-tv, and I'll report back here with what I learned. if the initial results look promising, it seems perhaps worth figuring out how to run arm-tv inside the fuzzing loop, if nothing else to make the workflow smoother and avoid having to work with giant directories full of IR files. in terms of providing additional feedback to AFL++, that's a very interesting question. the only thing I can think of offhand is to use arm-tv as a source of coverage feedback, instead of the LLVM backend. presumably this would not be too difficult. I don't have any idea what the results would look like, but it would seem worth trying out if it's not too hard. |
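The offline mode discussed above (fuzz first, then feed the saved seeds to a validator) could be scripted roughly as follows. This is a sketch under assumptions: the validator command is passed in by the caller, and a non-zero exit code is treated as an alarm. The thread does not describe arm-tv's real command line or exit codes, so `validate_seeds` and its parameters are hypothetical.

```python
import subprocess
from pathlib import Path


def validate_seeds(seed_dir, validator_cmd, log_path, timeout_s=60):
    """Run `validator_cmd + [seed]` on every .ll/.bc file and log alarms and timeouts.

    Assumption: the validator exits with 0 when the file verifies and non-zero otherwise.
    """
    counts = {"ok": 0, "alarm": 0, "timeout": 0}
    with open(log_path, "w") as log:
        for seed in sorted(Path(seed_dir).iterdir()):
            if seed.suffix not in {".ll", ".bc"}:
                continue  # skip anything that is not an IR or bitcode file
            try:
                proc = subprocess.run(
                    list(validator_cmd) + [str(seed)],
                    capture_output=True,
                    text=True,
                    timeout=timeout_s,
                )
            except subprocess.TimeoutExpired:
                counts["timeout"] += 1
                log.write(f"TIMEOUT {seed}\n")
                continue
            if proc.returncode == 0:
                counts["ok"] += 1
            else:
                counts["alarm"] += 1
                log.write(f"ALARM {seed} exit={proc.returncode}\n")
    return counts
```

Writing only alarms and timeouts to the log keeps the output small even when the seed directory holds a very large number of IR files.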
Per discussion in SecurityLab-UCD/IRFuzzer#49, generating undef during fuzzing seems to be less fruitful. Let's eliminate undef in favor of poison unless the user explicitly asked for it. Signed-off-by: Peter Rong <[email protected]>
Just updated this branch to include my commit to llvm. You may |
Here's one easy but ideally effective idea, only instrument subclass of |
@regehr Just checking how's IRFuzzer working for you? |
hi @DataCorrupted! it was going just fine up until I got interrupted by a bunch of complications that are now mostly behind me. I'll get back to this within the next week or two!!! |
Sounds good. |
hi Peter, I came back to this but I think I'm going to need more explicit instructions about what to do. as I said, running "docker build ." succeeds for me, but when I try to run the fuzzing script I just get:
|
actually, maybe there's an easier solution here, where we can get some results without running each other's software. Peter would you mind sending me the final output of a week-long (or however long) run of IRFuzzer on the AArch64 backend (with or without global isel, or both) and I'll pass these through arm-tv and let you know what the results are? |
thanks! I can work with this. will let you know the results. |
ok-- arm-tv is running on my 128-thread machine on all of the bitcode files in your artifact. will let you know if anything good comes up! unrelated, something that might be interesting for you is Kostya's new fuzzer, Centipede: https://github.com/google/fuzztest/tree/main/centipede something complex like an LLVM backend seems like a perfect target for this fuzzer. see the slide deck linked to the project's README.md |
arm-tv has signaled some wrong code bugs, here's the summary of the ones I've looked at so far:
so anyway, that's interesting and fun but no real miscompiles so far. but it has only gone through a couple percent of the IR files so far, and I'll keep looking. |
I had my eyes on Centipede, but never had time to dig into it: I'm running multiple things in parallel and have simply run out of bandwidth :( In terms of your findings, that's expected for IRFuzzer. It only cares about semantic correctness (i.e., the code should compile), and it often generates code that doesn't make sense. Such behavior can be adjusted if you modify the source code. All in all, sounds very interesting! Looking forward to more exciting findings! |
update: I ran every bitcode file in the then I started a new run, this time using global isel. it has not finished, but a miscompile already popped up. Peter, I am going to go ahead and just CC you in any bug that I report, that comes from IRFuzzer-generated code. does that sound like a good plan? all of the testing that I'm doing is using the generic AArch64 backend. we've not yet even started to look at the more specific AArch64 targets or target options. link to bug: |
ok, the global isel run finished without finding any further big problems. I still have plenty of alarms to look through, but most of them appear to correspond to known weaknesses in our pointer support (this is hard stuff) that we'll hopefully get fixed this summer. at that point I can run all of your test cases again. at one level it's good news that all of these tests resulted in just one (obvious) miscompile, it means that this backend is pretty strong right now. but on the other hand, I'm surprised that a new fuzzer doesn't turn up more stuff. I'm not really sure what to do next, let me know if you have ideas. |
|
I think I used the head of that branch. But if you don't mind, I'd be happy to keep working in our current mode where you run your tool and I run mine! This seems very easy for both of us. But also I'm happy to keep trying to build+run IRFuzzer. Another thing we should investigate is whether IRFuzzer should run arm-tv. Maybe the custom mutator could fork a process for arm-tv? My thinking is that it would only do this some of the time, like when new coverage is found, or else probabilistically. We don't want to run arm-tv too much since it's slow and we don't want to take all CPU time away from fuzzing. So we could maybe adjust the probability so that arm-tv is always using, for example, fewer than 25% of the cores.
I think this is worth trying. If you need to sink a value, maybe just combine it somehow with the function's return value? |
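One way to read the throttling idea above is as a small gate that the fuzzing loop consults before forking the oracle. The sketch below is an illustration only: `OracleThrottle` and its parameters are hypothetical, it caps concurrent oracle processes at about a quarter of the cores, and it launches either on new coverage or with a small sampling probability.

```python
import os
import random


class OracleThrottle:
    """Decide when to launch a slow oracle so it uses at most a share of the cores."""

    def __init__(self, total_cores=None, max_share=0.25, sample_prob=0.05, rng=None):
        self.total_cores = total_cores or os.cpu_count() or 1
        # Always allow at least one oracle process, even on very small machines.
        self.max_running = max(1, int(self.total_cores * max_share))
        self.sample_prob = sample_prob
        self.rng = rng or random.Random()
        self.running = 0

    def should_launch(self, found_new_coverage: bool) -> bool:
        """Launch on new coverage, or probabilistically, but never above the cap."""
        if self.running >= self.max_running:
            return False
        return found_new_coverage or self.rng.random() < self.sample_prob

    def started(self) -> None:
        """Record that one oracle process was forked."""
        self.running += 1

    def finished(self) -> None:
        """Record that one oracle process exited."""
        self.running = max(0, self.running - 1)
```

The caller is expected to invoke `started()` after forking and `finished()` when the child exits. Tuning `sample_prob` trades oracle coverage against fuzzing throughput, while the cap itself bounds the worst case.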
(I tagged the wrong link to irfuzzer-alive image) |
I didn't fully get your point on arm-tv. Shall we schedule a Zoom call to sort that out? Also, I want to understand more about how limited Alive2 is in terms of pointers, so that I can better modify the mutator. |
hi Peter, yes, it would be great to talk. now that the semester has ended my schedule is pretty open. maybe suggest a time next week? it's not Alive2 that is limited in pointers, but my tool arm-tv which builds on Alive2 to do translation validation for the AArch64 backend. the fundamental problem is that the ARM assembly code freely mixes pointers and integers, so when I lift that code back to LLVM IR, there are a ton of ptrtoint and inttoptr instructions, and these are fundamentally difficult to reason about. basically we're running into open research problems here. Nuno has implemented lots of partial solutions for me, but problems remain. we hope to get them solved this summer. |
Sounds good. |
oops I have a meeting right at 2pm on the 7th, can we do an hour earlier? regarding arm-tv, I'm just saying that it needs to get run somewhere, and I think it's more elegant and efficient to run it as part of the fuzzing process, rather than running it as a batch job afterwards. in this mode it will not participate in fuzzing actively, it simply acts as a passive test oracle, logging errors into a file somewhere. |
Just rescheduled. I don't think it would be too difficult, nothing that can't be done with some scripting. But I do need to learn how to use arm-tv first. |
excellent. I'll give you an arm-tv demo when we talk. also, our own fuzzer has gone for >24 hours without finding any miscompiles. if it finishes its run (in 2-3 days) without finding any, that would be the first time this has ever happened. so maybe between our two fuzzing efforts (+ whoever else is doing this kind of work) we've mined out most of the easy stuff from the AArch64 backend |
hello, I'd like to try out IRFuzzer but when I run
./build.sh
I'm getting an error, below. also, is there any version of IRFuzzer that works with LLVM top of tree? all of the fuzzing that I do is against the latest version. thanks!