Skip to content

Unwind using GOTO unwinder #161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
May 30, 2022

Conversation

FrNecas
Copy link
Contributor

@FrNecas FrNecas commented Apr 20, 2022

This PR introduces a new unwinder which unwinds on GOTO level, makes necessary transformations (mainly related to dynamic object handling) and then computes a new SSA with correct points-to analysis. So far, we have not managed to integrate this solution with incremental SAT solving but this already shows promising results.

The GOTO unwinding itself is slower due to not making use of incremental SAT solver. To overcome this, the current implementation uses both the old unwinder (by default) and then the new unwinder only in cases where dynamic objects are used with unwinding (or k-induction). Such approach combines the best of both worlds -- it is sound when the old approach was not but it is quick where the overhead would not bring any advantages. This is achieved by the unwinders implementing a general interface.

Experiments show promising results so far, we've compared the old and new solution on a subset of SV-COMP benchmarks (most of reach-safety was included, the whole memsafety and part of termination) with a 5 minute timeout:

Category Correct Tasks (old) Incorrect Tasks (old) Score (old) Correct Tasks (new) Incorrect Tasks (new) Score (new)
reach-safety 776 2 1331 814 1 1420
memsafety 92 3 83 143 6 96
termination 164 0 280 164 0 280

All of these improvements come from proper use of k-induction in benchmarks with dynamic memory. The 3 new failing tests in memsafety are a bit unfortunate, these are:

  • memsafety-ext/tree_dsw
  • memsafety-ext/tree_of_cslls
  • list-ext3-properties/dll_circular_traversal

In order to make memory leak analysis sound with unwinding, I had to introduce a workaround which duplicates assignments to __CPROVER_memory_leak variable for each malloc/free. I initially expected that some of the new failing benchmarks will be caused by the same problem but with __CPROVER_deallocated so I tried implementing a "split" of this variable as well (present on deallocated-fix branch on my fork), however it introduced more incorrect results and slowed 2LS down significantly. I assume there will be differrent reasons for the 3 incorrect benchmarks, haven't figured it out just yet.

This PR also re-labels some of our old regression tests since they now work properly with the new unwinder and adds some new tests where the upsides of this solution are demonstrated.

FrNecas added 8 commits April 13, 2022 14:17
The possibility of unwinding only a single loop is never utilized, all
loops in all functions are always unwound in 2LS. Removing the
per-loop tracking simplifies the interface of the unwinder which will
make it easier to integrate switching with GOTO unwinder.

Signed-off-by: František Nečas <[email protected]>
They won't be implementable in the new unwinder anyway and most likely
are not needed.

Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
This class won't be used by the GOTO unwinder and hence it doesn't make
sense to have it in the public interface for both of these classes.

Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
This is necessary since summary checker initializes the unwinder and the
GOTO unwinder needs access to the GOTO model.

Signed-off-by: František Nečas <[email protected]>
@peterschrammel peterschrammel self-requested a review April 20, 2022 22:25
Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, František, that's a great piece of work. The code is very clean and easy to follow.
I like the solution of falling back to the old unwinder. This is great for now.
However, going forward, the idea I had back then was to find a mapping scheme that allows us to cache most of the SSA (and hence the formula already pushed into the solver instance) when adding an iteration, thus separating the unwinding process from the SSA/solving layers and enabling incremental solving despite transformations on the GOTO level (unwinding, inlining, etc).

@FrNecas
Copy link
Contributor Author

FrNecas commented May 2, 2022

Thanks for finding time for a review, Peter! We have already discussed how we could integrate incremental solving, our idea was however quite different from your proposed approach (at least as far as I can understand from your description) so it would be nice to discuss the approach to go with in the future. I see this PR as a more of a transition state, it demonstrates the advantages of GOTO unwinding and improves capabilities of 2LS, though it's far from ideal.

Copy link
Collaborator

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A great piece of work! Looks good, just few nits and questions.

FrNecas and others added 18 commits May 25, 2022 21:58
Signed-off-by: František Nečas <[email protected]>
This is necessary to generate correct SSA for cases like:
    assert(p->x->y != NULL);
Previously, the check didn't work correctly for multiple consecutive
assignments of type i = i->next (which could be introduced by
unwinding). However, if an object was concretized in a location and then
the pointer was overwritten, the concretization is no longer valid.

Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
Since the unwinding may (and in most cases will) generate new dynamic
objects, the semantics of the unwound assertion changes from the
previous unwinding and hence such constraints would no longer be valid.

Signed-off-by: František Nečas <[email protected]>
The names of dynamic objects are "hardcoded" in malloc_ssa and unwinding
doesn't update the names, the location inside it is kept intact. We need
to manually rename the objects in order for the unwinder to work with
dynamic objects.

Signed-off-by: František Nečas <[email protected]>
Such BMC is now correctly supported thanks to unwinding GOTO, however
the tests needed some adjustments -- some undefined behaviour was
present, the expected assertion results were not correct.

Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
Signed-off-by: František Nečas <[email protected]>
The way CPROVER instruments programs for checking memory leaks was not
sufficient for 2LS's verification approach paired with unwinding. The
instrumented output utilized a single __CPROVER_memory_leak variable
and it consisted of 4 logical parts:
 - initialization to NULL at the beginning
 - setting it during malloc (CPROVER prefix is stripped):
	memory_leak = record_leak ? malloc_value : memory_leak
 - resetting it during free:
	IF !(__CPROVER_memory_leak=free::ptr) GOTO 1
	  ASSIGN __CPROVER_memory_leak=NULL
     1: ...
 - assert at the end that __CPROVER_memory_leak is NULL

The usage of just one variable lead to a problem in cases where there
are multiple allocation sites. The variable would keep track of just one
of the pointers/objects from these allocation sites so in some cases
freeing a single pointer and leaving the rest leaking would result in a
successful analysis.

Overcome this problem by splitting the __CPROVER_memory_leak variable
into multiple ones (one for each allocation site) at the beginning of
analysis and then after each unwinding (as each unwinding can introduce
a new allocation site).

Signed-off-by: František Nečas <[email protected]>
@FrNecas FrNecas force-pushed the frnecas-unwind-polymorph branch from d7d6edf to 66b0ddd Compare May 25, 2022 20:00
@FrNecas
Copy link
Contributor Author

FrNecas commented May 26, 2022

Thanks for the reviews, all comments should hopefully be addressed.

@viktormalik viktormalik merged commit 33e3260 into diffblue:master May 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants