adamperlin
Contributor

This PR prevents fuzz_host_call from hitting a memory leak upon encountering host function calling errors. It restores to a known clean snapshot on each iteration. This fix should not be needed once #826 is fixed.

Snapshot restore initially wasn't working in the fuzzing case because of a bug discovered by @ludfjig in call_type_erased_guest_function_by_name (the snapshot wasn't being set to None), so this PR fixes that bug as well.

This PR also more explicitly ignores some expected errors that may come up from host call fuzzing.
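
To illustrate the restore-per-iteration idea, here is a minimal sketch using a toy sandbox. `ToySandbox`, `Snapshot`, and `peak_leak_at_iteration_start` are hypothetical names invented for this example and are not the Hyperlight API; the "leak" is simulated by bytes that stay in the toy memory when a call fails.

```rust
// Toy model only: these types are hypothetical stand-ins, not Hyperlight types.

/// A saved copy of the toy sandbox memory.
struct Snapshot {
    memory: Vec<u8>,
}

struct ToySandbox {
    memory: Vec<u8>,
}

impl ToySandbox {
    fn new() -> Self {
        ToySandbox { memory: Vec::new() }
    }

    fn snapshot(&self) -> Snapshot {
        Snapshot { memory: self.memory.clone() }
    }

    fn restore(&mut self, snap: &Snapshot) {
        self.memory = snap.memory.clone();
    }

    /// Simulated guest call. The input is "allocated" into memory; if the
    /// simulated host function fails (first byte 0xFF), the allocation is
    /// never released, which stands in for the leak.
    fn call(&mut self, input: &[u8]) -> Result<(), String> {
        self.memory.extend_from_slice(input);
        if input.first() == Some(&0xFF) {
            return Err("host function error".to_string());
        }
        // Success path: release the allocation again.
        let new_len = self.memory.len() - input.len();
        self.memory.truncate(new_len);
        Ok(())
    }
}

/// Runs all inputs and returns the largest number of leaked bytes observed
/// at the start of any iteration.
fn peak_leak_at_iteration_start(inputs: &[Vec<u8>], restore_each_iteration: bool) -> usize {
    let mut sandbox = ToySandbox::new();
    let clean = sandbox.snapshot();
    let mut peak = 0;
    for input in inputs {
        if restore_each_iteration {
            // Start every iteration from the known clean state.
            sandbox.restore(&clean);
        }
        peak = peak.max(sandbox.memory.len());
        // Errors from the call are expected while fuzzing and are ignored.
        let _ = sandbox.call(input);
    }
    peak
}
```

With restoring enabled, each iteration starts from the clean state, so leaked bytes from a failed call cannot accumulate across iterations in this toy model.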

Contributor

@ludfjig ludfjig left a comment

Nice, lgtm

1. Setting sandbox snapshot to none in
   call_type_erased_guest_function_by_name
2. Restoring to an initial snapshot on each fuzzing iteration to
   avoid hitting a known memory leak.

Signed-off-by: adamperlin <[email protected]>
@adamperlin adamperlin force-pushed the adamperlin/triage-fuzzing-memory-leak branch from 6306e2d to 426baaf August 28, 2025 21:33
@adamperlin adamperlin enabled auto-merge (squash) August 28, 2025 21:34
@adamperlin adamperlin merged commit a835ef5 into hyperlight-dev:main Aug 28, 2025
33 checks passed
Comment on lines +384 to +385
// Reset snapshot since we are mutating the sandbox state
self.snapshot = None;
Contributor

Why are we clearing this here? Shouldn't the restore call above take care of this?

Contributor

Otherwise I would have expected it to look like

pub fn call_guest_function_by_name<Output: SupportedReturnType>(

Contributor Author

I think this is a bug: since this function is wrapping call_guest_function_by_name_no_reset and we're mutating the guest sandbox, the attached snapshot will be invalidated here, right? If we don't clear the snapshot, the restore call will hit the "snapshot exists" case and won't do the restore.
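
A minimal sketch of the stale-snapshot problem described here, under the assumption that restore skips its work when the sandbox believes it already matches the requested snapshot. `ToySandbox`, its `snapshot` field, the `id` comparison, and `call_mutating` are hypothetical and only model the behavior discussed in this thread; they are not the actual Hyperlight implementation.

```rust
// Toy model only: hypothetical types illustrating a cached-snapshot short circuit.

#[derive(Clone)]
struct Snapshot {
    id: u32,
    state: Vec<u8>,
}

struct ToySandbox {
    state: Vec<u8>,
    /// The snapshot this sandbox believes its state currently matches, if any.
    snapshot: Option<Snapshot>,
    next_id: u32,
}

impl ToySandbox {
    fn new() -> Self {
        ToySandbox { state: Vec::new(), snapshot: None, next_id: 0 }
    }

    fn snapshot(&mut self) -> Snapshot {
        let snap = Snapshot { id: self.next_id, state: self.state.clone() };
        self.next_id += 1;
        self.snapshot = Some(snap.clone());
        snap
    }

    fn restore(&mut self, snap: &Snapshot) {
        // "Snapshot exists" case: assume the state already matches and skip.
        if self.snapshot.as_ref().map(|s| s.id) == Some(snap.id) {
            return;
        }
        self.state = snap.state.clone();
        self.snapshot = Some(snap.clone());
    }

    /// A call that mutates sandbox state. The flag models the fix: clearing
    /// the cached snapshot because the state no longer matches it.
    fn call_mutating(&mut self, byte: u8, clear_cached_snapshot: bool) {
        self.state.push(byte);
        if clear_cached_snapshot {
            self.snapshot = None;
        }
    }
}

/// Takes a clean snapshot, mutates the sandbox, restores, and returns the state.
fn state_after_restore(clear_cached_snapshot: bool) -> Vec<u8> {
    let mut sandbox = ToySandbox::new();
    let clean = sandbox.snapshot();
    sandbox.call_mutating(42, clear_cached_snapshot);
    sandbox.restore(&clean);
    sandbox.state
}
```

In this model, leaving the cached snapshot in place makes the restore a no-op, so the mutation survives; clearing it forces the restore to actually copy the clean state back.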

Contributor Author

@adamperlin adamperlin Aug 28, 2025

We could change the behavior to do the snapshot and restore inside call_type_erased_guest_function_by_name instead, but for the moment it seemed clearer to do it in the fuzzing code!

Contributor

Oh, I think I misread this. I think this is fine and mimics the 'call func'

Contributor Author

Ah ok, I'm glad it looks fine!

Labels
kind/bugfix For PRs that fix bugs