Sifting instruction encodings on ARM64, many capstone unsupported encodings discovered #2150
Comments
Using my branch is currently the best option you have. Because it will take a while until everything is merged into I'll still work on it though, so there might be some things missing (but there shouldn't be many) and I will push stuff to it. But for a simple check of whether an instruction decodes, it is enough. Last time I checked the whole encoding space ( Regarding your overall research: Are you aware of this PR? It adds detailed encoding of each instruction to detail (as detailed as LLVM is, which is sometimes great and sometimes meh).
@Rot127 Thanks for the quick response! I'll start right away to implement your branch into my project. I'll let you know sometime tomorrow what the results are and whether anything is remaining, along with any issues I might have encountered. Yes, I am aware of that PR, and I started to incorporate it into my work last week. Appreciate you pointing it out though! Thanks for all the hard work. Cheers
Great! I am happy about any feedback! There haven't been many eyes on it yet, so suggestions about improvements and issues are very welcome!
Hi @Rot127 👋 I made a PR against your repo for some changes that were required to build the whole project on the latest ARM64 macOS, and maybe some cleanups. I'm a noob in this codebase though, so I apologize if I implemented things incorrectly. Happy to make any changes needed. So far the branch is working well 🎉
I'm going to keep this open for a little longer until I've run my tool through a couple of times. Thanks
Any more things you needed? Otherwise we can close this. For AArch64 we will come up with an update to LLVM 18 soon: #2298
@watbulb Closing this for now. Please let me know if you find more missing instructions which were added in LLVM 18 or earlier.
Hello,
I am working on a project to locate undefined instructions on various ARM64 processors, and attempt to attribute them to hardware.
In my code, I do a naïve masked increment to search the encoding space from 00 00 00 00 to ff ff ff ff. However, before I run the incremented value as an instruction, I first pass it to capstone to check whether the encoding is known to some disassembler. Only then do I attempt to execute the instruction and check various pieces of the processor state to see whether it executed/decoded.
Doing this increment, disassemble, check loop has produced a corpus of instructions that decode properly using LLVM 16.0.6 objdump, but that capstone has no knowledge of. Some of these are due to missing extension support in capstone, which is fine; I can filter and work around that. The instructions I am concerned about are those in the base ISA for AArch64 that LLVM handles, but capstone does not.
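For readers who want to picture the increment, disassemble, check loop, here is a minimal sketch of one possible reading of it. It is not the code from my project: the names masked_increment, encodings, to_bytes and sift are made up for illustration, and the two decoders are passed in as plain callables (hypothetical stand-ins for an LLVM-based check and a capstone-based check) so the sketch does not depend on any particular disassembler API.

```python
import struct
from typing import Callable, Iterable, Iterator, List

WORD_MASK = 0xFFFFFFFF  # A64 instructions are fixed-width 32-bit words


def masked_increment(sub: int, mask: int) -> int:
    """Return the next combination of the bits selected by `mask`.

    Bits outside the mask are forced to 1 so the carry ripples over them,
    then they are cleared again. Wraps around to 0 after the last combination.
    """
    return ((sub | (~mask & WORD_MASK)) + 1) & mask


def encodings(mask: int, base: int = 0) -> Iterator[int]:
    """Yield `base | s` for every combination `s` of the bits in `mask`."""
    sub = 0
    while True:
        yield base | sub
        sub = masked_increment(sub, mask)
        if sub == 0:  # wrapped around: every combination has been visited
            break


def to_bytes(encoding: int) -> bytes:
    """Pack a 32-bit encoding as little-endian bytes, as instructions sit in memory."""
    return struct.pack("<I", encoding)


def sift(
    candidates: Iterable[int],
    reference_decodes: Callable[[bytes], bool],  # hypothetical: e.g. an LLVM-based check
    tested_decodes: Callable[[bytes], bool],     # hypothetical: e.g. a capstone-based check
) -> List[int]:
    """Collect encodings the reference decoder accepts but the tested one rejects."""
    missing = []
    for enc in candidates:
        raw = to_bytes(enc)
        if reference_decodes(raw) and not tested_decodes(raw):
            missing.append(enc)
    return missing
```

With a mask of 0xFFFFFFFF the generator walks the whole space from 00 00 00 00 to ff ff ff ff; a narrower mask plus a fixed base restricts the search to one instruction class. Keeping the decoders as callables is only a design choice for the sketch; in practice each one would wrap whatever disassembler is being compared.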
I wanted to start a discussion here about how I should go about working with the capstone contributors and which way would be best to report these decoding inconsistencies. I can upload a corpus of instructions that are not part of an extension set for AArch64 which capstone does not decode, but LLVM does. Would this be the best way forward? Unfortunately, I'm not terribly familiar with the capstone codebase, but I'm quite familiar with TableGen, and I'd be happy to try and diagnose this if it's indeed an issue and I'm not crazy or doing something stupid 😆. I apologize if this is just a bunch of noise that will be fixed in #2026. I can also try @Rot127's auto-sync-aarch64 branch now and report whether these have been fixed, if at all helpful.
Thank you!
Below I'll include a couple examples of these instructions:
LDRSB
LLVM objdump 16.0.6
cstool 5.0.1:
LDXRB
LLVM objdump 16.0.6
cstool 5.0.1:
LDTR
LLVM objdump 16.0.6
cstool 5.0.1: