-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect source value for Phase II(i) value computation #14
Comments
Hello. I think that there actually could be a semantic problem with using non-zero seeds. Remember that jump functions are really functions, of the form "in any context in which x holds initially, y holds at the target statement". I think if you use a non-zero seed then you are saying that the initial value only holds for certain contexts, namely those in which your non-zero seed applies. Could you maybe explain a bit more about your client analysis so that I can understand what it does and how it behaves? |
Sure. For certain functions, I want to track all possible heap allocations that may reach that function call. (For reasons specific to the analysis there is a reason I'm not simply using Soot's built into PTA and then extracting the AllocNodes...) For example, given a function f(x, 2) I want to find all heap allocations that reach x. There may be multiple invocations of f; I want to find not simply which allocations reach an invocation of f, but which allocations reach which invocations of f. For that reason, I assign a unique identifier to each invocation; the values computed along edges are taken from the powerset of all identifiers. The initial seeds for this backwards-may analysis are the arguments of interest at the each invocation of f. So for example, if there are two invocations Upon further digging it appears that initial seeds (n,d) that are not of the form (sP, zero) are simply seeded into the analysis as jump functions from (sP, zero) -> (n,d) with the corresponding path function being the identity function. Thus, a jump function never originates from (n,d), however the Phase II(i) computation appears to assume that the initial seeds are the source of jump functions; I believe that this mismatch of assumptions causes the error. |
Hi @jtoman @johspaeth or @johanneslerch should also be able to best comment on your assessment. @johanneslerch can you comment on this? I think you are currently looking in this code anyway... |
I agree, this looks like a bug to me as well. The initial seed edges are propagated using It is questionable if the initial edge should be If choosing the former case, then in addition
the computation should start for the zero value:
@ericbodden which version do you prefer? An initial edge from zero to the custom specified seed value or a self-loop edge using only the specified seed (i am not sure if the latter case may cause other assumptions to fail) @jtoman can you try the sketched bug fix and check if it solves your problem? I have some general remarks addressing your analysis design. I am not sure if scalability is an issue in your case, but if you encounter problems here you might want to rethink if it is a good choice to have different facts D for each initial seed. Different facts will never allow you to reuse computed summaries. Basically, your approach is as good as running the analysis for each initial seed in isolation. I immediately see two alternatives allowing you to reuse summaries.
|
Hi @johanneslerch, as you suggested, my analysis is actually an IDE problem, the IDE values L are what track invocation sites and the facts D are access paths that are invocation site agnostic. I will try the sketched bug-fix later today. |
I believe I've run across a bug in the implementation of Phase II(i) in IDESolver. The problem seems arise when a fact besides the zero fact is used for an initial seeds.
The problem occurs in
propagateValueAtStart
:In this case,
d
is a non-zero fact that was an initial seed. The forwardLookup call returns that there are no destination (fact,function) pairs for the source factd
. This is because most (all?) jump functions have the zero fact as their source fact. The analysis seems to work if I explicitly replace the second element of the tuple passed topropagateValueAtStart
with the zero fact, but this fix is a hack at best and I'm not sure if it is correct in general.The text was updated successfully, but these errors were encountered: