DiuDiu777

This PR adds the tool description and workflow for RAPx.

Since we plan to attempt Challenge 23 (verifying Vec), this PR uses the vec::into_raw_parts_with_alloc API as an example. It demonstrates how to annotate the target function and mark the relevant unsafe callees with safety tags for RAPx verification.

Related issue: Add Tool: RAPx

@DiuDiu777 DiuDiu777 requested a review from a team as a code owner September 11, 2025 14:55
@btj

btj commented Sep 14, 2025

Hi @DiuDiu777 ! Very excited to see another participant entering the contest.

I'm trying to understand your approach. In particular, I'm reading the Unsafe Code Audit chapter of the RAPx Book. Could it be that there are some typos in the code examples?

  • Function f calls SecretRegion::from(p, 0) but p is of type *mut u32 and from requires a Vec<u32>?
  • What's the point of the len field? It's never read?
  • In the annotations for offset:
    • Should it be InBound instead of InBounded?
    • Is a precondition saying that isize::MIN <= count * size_of::<T>() missing?
  • Annotations for xor_secret_region:
    • Are these automatically generated or manually annotated?
    • self is of type &SecretRegion, right? That struct's fields are buffer, of type Vec<u32>, and len, not buffer, of type *mut u32, and size, as this listing seems to assume?
    • region should be self?

Is there a paper arguing the soundness of your unsafe code audit approach with respect to a formal semantics of the programming language (such as I do in Featherweight VeriFast)? I found this paper but it does not seem to make any soundness claims.

@btj

btj commented Sep 14, 2025

Also, your example in the book is still somewhat complex. It would be very helpful, I think, if you could do a similar walkthrough for Vec::into_raw_parts_with_alloc, which seems even simpler. How exactly does your tool verify this function? Many thanks in advance!

@DiuDiu777
Author

Hi, @btj . Thank you so much for your careful review and thoughtful feedback!

Regarding your questions:

  • Function f calls SecretRegion::from(p, 0) but p is of type *mut u32 and from requires a Vec<u32>?
  • What's the point of the len field? It's never read?
    • The code example is primarily meant to illustrate annotation and verification logic, so some parts may not be fully realistic.
  • In the annotations for offset:
    • You're right, there are typos in the annotations and we'll correct those.
  • Are these automatically generated or manually annotated?
    • The xor_secret_region annotations are manually added, similar to how we handle standard library functions to be verified.
  • self is of type &SecretRegion, right? That struct's fields are buffer, of type Vec<u32>, and len, not buffer, of type *mut u32, and size, as this listing seems to assume?
  • region should be self?
    • The struct fields should indeed align with the definition. We'll update the example to use buffer.

As for the soundness and complexity of examples, we're currently refining the document to better explain soundness guarantees and simplify examples where possible. Thanks again for your valuable input!

@hxuhack

hxuhack commented Sep 29, 2025

Hi @btj , thanks for reviewing our work. I have written a short article that explains why our core methodology, tracing-based verification, is sound.

@btj

btj commented Sep 30, 2025

Hi @hxuhack , many thanks for the article. However, I had hoped that it would clarify the key concepts of your approach, such as unsafety propagation graph, object flow edge, basic unit, audit unit, safety property, dominated graphs, contractual invariant states, operational trace states, vulnerable paths, constructor analysis, and method sequence analysis. It would be very useful to see a formal definition of (simplified versions of) these concepts, and a formal definition of the overall unsafe code audit approach, and then a rigorous proof relating these concepts to a formal syntax and semantics of (a simplified version of) Rust, showing that if the approach accepts the program, then the program has no Undefined Behavior. Something like what we did in the Featherweight VeriFast paper.

Here are some specific questions to which I did not immediately find an answer in the materials currently available:

A. Would your tool detect the error in the following incorrect version of Vec::into_raw_parts_with_alloc? If so, how?

    pub fn into_raw_parts_with_alloc(self) -> (*mut T, usize, usize, A) {
        let mut me = self; // was: let mut me = ManuallyDrop::new(self);
        let len = me.len();
        let capacity = me.capacity();
        let ptr = me.as_mut_ptr();
        let alloc = unsafe { ptr::read(me.allocator()) };
        (ptr, len, capacity, alloc)
    }

B. Would your tool detect the error in the following incorrect version of Vec::into_raw_parts_with_alloc? If so, how?

    pub fn into_raw_parts_with_alloc(self) -> (*mut T, usize, usize, A, A) {
        let mut me = ManuallyDrop::new(self);
        let len = me.len();
        let capacity = me.capacity();
        let ptr = me.as_mut_ptr();
        let alloc = unsafe { ptr::read(me.allocator()) };
        let alloc2 = unsafe { ptr::read(me.allocator()) }; // I added this line
        (ptr, len, capacity, alloc, alloc2)
    }

C. Would your tool detect the error in the following incorrect version of Vec::into_raw_parts_with_alloc? If so, how?

    pub fn into_raw_parts_with_alloc(self) -> (*mut T, usize, usize, A) {
        let mut me = Box::new(self);
        let len = me.len();
        let capacity = me.capacity();
        let ptr = me.as_mut_ptr();
        let allocator_ptr = me.allocator() as *const A;
        drop(me);
        let alloc = unsafe { ptr::read(allocator_ptr) };
        (ptr, len, capacity, alloc)
    }

@hxuhack

hxuhack commented Oct 14, 2025

RAPx's verification mechanism is similar to taint analysis, where unsafe code serves as the taint source and function exits act as the sink. RAPx begins the verification process by identifying all unsafe callees, including intrinsic operations such as raw pointer dereferencing. The method into_raw_parts_with_alloc() contains one unsafe call site, ptr::read<T>(), which is associated with the following safety constraints:

    #[cfg_attr(rapx, safety { ValidPtr(src, T, 1) })]
    #[cfg_attr(rapx, safety { Typed(src, T) })]
    #[cfg_attr(rapx, safety { Align(src, T) })]
    #[cfg_attr(rapx, safety { any { hazard.Alias(src, ret), Trait(T, Copy) } })]
    pub const unsafe fn read<T>(src: *const T) -> T

RAPx considers a method sound if, in all possible executions reaching the unsafe call site, the safety constraints are satisfied.
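As a rough illustration of this criterion, the check can be pictured as follows. This is a toy model only, not RAPx's actual algorithm or data structures: each path reaching the unsafe call site carries a set of established facts, and the method is accepted only if every path establishes every required tag. The names `Tag`, `Path`, `unmet`, and `is_sound` are hypothetical and introduced purely for this sketch.

```rust
use std::collections::HashSet;

// Hypothetical safety tags, loosely named after the annotations on ptr::read.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Tag {
    ValidPtr,
    Typed,
    Align,
}

// One execution path reaching the unsafe call site, together with the
// facts that hold when the call is made (an assumption of this toy model).
struct Path {
    name: &'static str,
    facts: HashSet<Tag>,
}

// Tags required by the callee that the given path fails to establish.
fn unmet<'a>(required: &'a [Tag], path: &Path) -> Vec<&'a Tag> {
    required.iter().filter(|t| !path.facts.contains(*t)).collect()
}

// Accept the method only if every path satisfies every required tag.
fn is_sound(required: &[Tag], paths: &[Path]) -> bool {
    paths.iter().all(|p| unmet(required, p).is_empty())
}

fn main() {
    let required = [Tag::ValidPtr, Tag::Typed, Tag::Align];

    // Models the original method: the pointer comes from a live reference.
    let original = Path {
        name: "original",
        facts: required.iter().cloned().collect(),
    };
    // Models Case C: the pointee was dropped before the read.
    let case_c = Path {
        name: "case C",
        facts: [Tag::Typed, Tag::Align].into_iter().collect(),
    };

    for p in [&original, &case_c] {
        println!("{}: unmet tags = {:?}", p.name, unmet(&required, p));
    }
    assert!(is_sound(&required, &[original]));
    assert!(!is_sound(&required, &[case_c]));
}
```

In this model a single path with a missing tag is enough to reject the method, which mirrors the "all possible executions" wording above.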

Case A. No, RAPx will not directly detect this error, because the safety constraint of ptr::read() is always satisfied. Although the method returns a dangling pointer and thus deviates from its specification, it does not trigger undefined behavior.

RAPx will ultimately detect the unsoundness when verifying another method of the same struct, into_chunks<const N: usize>(), which invokes into_raw_parts_with_alloc(). In this method, the dangling pointer returned by into_raw_parts_with_alloc() is subsequently used by another unsafe function, Vec::from_raw_parts_in(), whose safety constraints cannot be satisfied.

    pub fn into_chunks<const N: usize>(mut self) -> Vec<[T; N], A> {
        ...
        let (ptr, _, _, alloc) = self.into_raw_parts_with_alloc();
        unsafe { Vec::from_raw_parts_in(ptr.cast(), len / N, cap / N, alloc) }
    }
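For contrast with Case A, here is a minimal sketch of the correct pattern on stable Rust, without the allocator parameter. The free function `split_vec` is a hypothetical stand-in for into_raw_parts_with_alloc() and is not the standard library's code. Wrapping the vector in ManuallyDrop keeps the buffer alive after the function returns, so the caller can satisfy the safety constraints of Vec::from_raw_parts(). Without the wrapper, the buffer would be freed on return and the same call in the caller would be undefined behavior.

```rust
use std::mem::ManuallyDrop;

// Hypothetical stand-in: splits a Vec into its raw parts without freeing
// the buffer. ManuallyDrop suppresses the Vec's destructor.
fn split_vec<T>(v: Vec<T>) -> (*mut T, usize, usize) {
    let mut me = ManuallyDrop::new(v);
    let len = me.len();
    let capacity = me.capacity();
    let ptr = me.as_mut_ptr();
    (ptr, len, capacity)
}

fn main() {
    let (ptr, len, cap) = split_vec(vec![1u32, 2, 3]);

    // SAFETY: ptr, len, and cap come from a Vec<u32> whose destructor was
    // suppressed, so the allocation is still live and owned by nobody else.
    let rebuilt = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    assert_eq!(rebuilt, [1, 2, 3]);
}
```

The unsafe block sits in the caller, which is where a per-call-site check like the one described above would have to notice the unsatisfied constraint in the Case A variant.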

Case B. The only potential source of undefined behavior I can see is a possible aliasing hazard caused by ptr::read(). If the allocator implements Copy, the operation is safe; however, if the allocator implements Drop, RAPx will issue a warning.
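The Copy versus Drop distinction can be illustrated with a small stable-Rust sketch. `Token` is a hypothetical stand-in for an allocator that owns a resource, and the two helper functions are likewise invented for this example. Reading a Copy value twice through ptr::read() is harmless, whereas a value with a destructor may be read out only once, and only if the original is never dropped; otherwise the destructor would run twice.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ptr;

// Hypothetical owner type: counts how many times its destructor runs.
struct Token<'a>(&'a Cell<u32>);

impl Drop for Token<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Duplicating a Copy value with ptr::read is fine.
fn read_copy_twice(x: &u32) -> (u32, u32) {
    // SAFETY: a reference is valid and aligned, and u32 is Copy, so the
    // two bitwise copies do not alias any owned resource.
    unsafe { (ptr::read(x), ptr::read(x)) }
}

// Moving an owner out exactly once; returns the number of destructor runs.
fn read_owner_once(drops: &Cell<u32>) -> u32 {
    let original = ManuallyDrop::new(Token(drops));
    // SAFETY: `original` is wrapped in ManuallyDrop and never used again,
    // so `moved_out` is the only value whose destructor will run.
    let moved_out = unsafe { ptr::read(&*original) };
    drop(moved_out);
    drops.get()
}

fn main() {
    assert_eq!(read_copy_twice(&7), (7, 7));

    let drops = Cell::new(0);
    assert_eq!(read_owner_once(&drops), 1);
}
```

A second ptr::read() of `original` in `read_owner_once` would produce two owners of the same value, which is the situation the alloc2 line in Case B creates.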

Case C. Yes. Since drop(me) causes the pointer allocator_ptr to become dangling, the safety constraint of ptr::read() cannot be satisfied.

In practice, when verifying the soundness of a struct method, we must also account for the influence of its constructors and other methods within the same struct. The soundness criteria and encapsulation requirements are discussed in our arXiv paper, A Trace-based Approach for Code Safety Analysis. Once the abstract interpretation component is completed, we plan to incorporate it, together with formal definitions of the key concepts you mentioned. Thank you very much for your insightful questions and thoughtful suggestions.

By the way, I will be giving a talk about the entire idea at RFMIG later this month, and you are warmly invited to attend.

@btj

btj commented Oct 14, 2025

Dear @hxuhack , Many thanks for your elaborate response; it's very helpful and kind. I hope you don't mind if I go on a little longer with my questions :-)

Do I understand correctly that your tool reasons about the me.as_mut_ptr() and me.allocator() calls in into_raw_parts_with_alloc, and the call of Box::new in my version C, by looking at their implementations? Does it recursively execute/unfold/inline all calls this way or does it try to determine which calls can be abstracted away without hurting the precision of the analysis? How does it make that determination?

On the other hand, your tool does not require a driver or test harness, i.e. a main function? It does not monomorphize? So it conservatively assumes the allocator might implement Drop and issues a warning for my version B because the allocator is duplicated?

For my case A, notice that the method returns not just a dangling pointer in the first component of the return value, but also a dropped allocator in the fourth component. Would your tool detect it if a (safe or unsafe) client tried to use the dropped allocator?

Looking forward to your talk! I hope to learn more about the algorithms and data structures underlying your approach, perhaps as applied to into_raw_parts_with_alloc.


3 participants