DRAFT: Create crashdumps for active sandboxes if process is crashing and creating a crashdump #992

simongdavies · 2025-10-28T13:05:44Z

This pull request introduces a new crash handler capability for Hyperlight, enabling automatic sandbox dump generation when the host crashes due to fatal signals (Linux) or exceptions (Windows). The system uses a global, lock-free registry for tracking sandboxes and installs platform-specific crash handlers. The implementation is gated behind a feature flag and is robust against initialization failures and recursive crashes.

Crash handler infrastructure:

Added a new crash_handler.rs module that manages sandbox registration and crash dump generation, using a global lock-free registry (DashMap) and lazy initialization with once_cell. It ensures safe (for crash context) access to hypervisor pointers and robust initialization.
Integrated new dependencies once_cell and dashmap in Cargo.toml to support lock-free global state and lazy initialization for the crash handler subsystem.
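
For readers unfamiliar with the pattern, here is a minimal sketch of a lazily initialized global sandbox registry. It is not the PR's code: it swaps in `std::sync::OnceLock` and a `Mutex<HashMap>` for `once_cell` and `DashMap` so that it builds without external crates, and `SandboxEntry`, `register`, `unregister`, and `registered_count` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for whatever the crash handler needs to dump a sandbox.
#[derive(Debug, Clone, PartialEq)]
pub struct SandboxEntry {
    pub name: String,
    pub memory_bytes: u64,
}

// Global registry, created on first use (std replacement for once_cell + DashMap).
static REGISTRY: OnceLock<Mutex<HashMap<u64, SandboxEntry>>> = OnceLock::new();
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

fn registry() -> &'static Mutex<HashMap<u64, SandboxEntry>> {
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Adds a sandbox and returns the id used to remove it later.
pub fn register(entry: SandboxEntry) -> u64 {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    // Recover the map even if another thread panicked while holding the lock.
    registry()
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
        .insert(id, entry);
    id
}

/// Removes a sandbox, for example when it is dropped.
pub fn unregister(id: u64) -> Option<SandboxEntry> {
    registry()
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
        .remove(&id)
}

pub fn registered_count() -> usize {
    registry()
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
        .len()
}

fn main() {
    let id = register(SandboxEntry { name: "demo".into(), memory_bytes: 4096 });
    println!("registered sandboxes: {}", registered_count());
    unregister(id);
}
```

A real crash handler would probably avoid a blocking `lock()` on the crash path, since the crashing thread may already hold the lock; that concern is one reason to prefer a concurrent map or `try_lock` there.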

Linux-specific crash handling:

Added a new crash_handler/linux.rs module that installs signal handlers for fatal signals (e.g., SIGSEGV, SIGABRT) and chains to the previous handlers after generating sandbox dumps. It detects whether core dumps are enabled on the system, and it deliberately violates async-signal-safety, but only while handling a crash.
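
One way to check whether core dumps are enabled is to read the soft "Max core file size" limit from `/proc/self/limits`. The sketch below assumes that approach; the PR may instead use `getrlimit(RLIMIT_CORE)` or inspect `core_pattern`, and `CoreLimit` and `parse_core_limit` are hypothetical names.

```rust
/// Soft limit on core file size, as reported by /proc/self/limits.
#[derive(Debug, PartialEq)]
pub enum CoreLimit {
    Disabled,
    Bytes(u64),
    Unlimited,
}

/// Parses the soft "Max core file size" value out of the text of
/// /proc/self/limits. Returns None if the line is missing or malformed.
pub fn parse_core_limit(limits: &str) -> Option<CoreLimit> {
    // Columns are: limit name, soft limit, hard limit, units.
    let rest = limits
        .lines()
        .find_map(|line| line.strip_prefix("Max core file size"))?;
    let soft = rest.split_whitespace().next()?;
    match soft {
        "unlimited" => Some(CoreLimit::Unlimited),
        "0" => Some(CoreLimit::Disabled),
        n => n.parse().ok().map(CoreLimit::Bytes),
    }
}

fn main() {
    match std::fs::read_to_string("/proc/self/limits") {
        Ok(text) => println!("core limit: {:?}", parse_core_limit(&text)),
        Err(err) => println!("could not read /proc/self/limits: {err}"),
    }
}
```

A limit of zero means the kernel will not write a core file to disk, so a handler could skip sandbox dumps in that case.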

Safety and error handling:

The crash handler code is careful to avoid unsafe behavior except during crash contexts, and includes mechanisms to prevent recursive crash handling, detect initialization failures, and fail gracefully when system configuration disables core dumps.
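
Recursive crash handling is commonly prevented with a single atomic flag that only the first entrant can claim. The following is a minimal sketch of that idea, not the PR's implementation; `CrashGuard` and `try_enter` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// One-shot guard: only the first caller of `try_enter` gets `true`.
pub struct CrashGuard(AtomicBool);

impl CrashGuard {
    pub const fn new() -> Self {
        CrashGuard(AtomicBool::new(false))
    }

    /// Atomically flips the flag from false to true. A second crash that
    /// occurs while dumps are being written sees `false` and can bail out
    /// instead of re-entering the dump code.
    pub fn try_enter(&self) -> bool {
        self.0
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}

// A handler would consult one process-wide guard.
static IN_CRASH_HANDLER: CrashGuard = CrashGuard::new();

fn main() {
    println!("first entry allowed: {}", IN_CRASH_HANDLER.try_enter());
    println!("second entry allowed: {}", IN_CRASH_HANDLER.try_enter());
}
```

Atomic compare-and-swap avoids taking a lock, which matters because a lock could already be held by the thread that crashed.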

Closes #966

Signed-off-by: Simon Davies <[email protected]>

dblnz

This looks good and clean from my point of view. Great work!

Some tests would be awesome, so we know this keeps working as expected with every change.

dblnz · 2025-10-28T13:58:01Z

src/hyperlight_host/src/crash_handler/linux.rs

+        let mut sa: sigaction = std::mem::zeroed();
+        sa.sa_sigaction = crash_signal_handler as usize;
+        sa.sa_flags = SA_SIGINFO | SA_RESTART;
+        libc::sigemptyset(&mut sa.sa_mask);


I am not sure how this works: is setting the new signal handler before retrieving the previous one the intended order?

simongdavies · 2025-10-29

Yes. When you set the handler, you get back any previous handler, which you can then chain to.
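
The "install, then receive the previous handler" behaviour can be seen in a small sketch. It assumes Linux, uses the simpler `signal` call rather than the PR's `sigaction` (both hand back the previous disposition), declares the two libc functions by hand to avoid an external crate, and uses SIGUSR1 so the process survives. All function and static names are hypothetical.

```rust
use std::ffi::c_int;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hand-declared libc bindings (assumption: sighandler_t is pointer-sized).
unsafe extern "C" {
    fn signal(signum: c_int, handler: usize) -> usize;
    fn raise(signum: c_int) -> c_int;
}

const SIGUSR1: c_int = 10; // Linux value on x86_64 and aarch64

static PREVIOUS: AtomicUsize = AtomicUsize::new(0);
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Stands in for a handler that was installed before ours.
extern "C" fn original_handler(_sig: c_int) {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// Our handler: does its own work, then chains to the saved handler.
extern "C" fn chaining_handler(sig: c_int) {
    CALLS.fetch_add(10, Ordering::SeqCst);
    let prev = PREVIOUS.load(Ordering::SeqCst);
    // 0 is SIG_DFL and 1 is SIG_IGN; neither is a callable function.
    if prev > 1 {
        let f: extern "C" fn(c_int) = unsafe { std::mem::transmute(prev) };
        f(sig);
    }
}

/// Installs both handlers, raises the signal once, and returns the call tally.
pub fn install_and_raise() -> usize {
    unsafe {
        signal(SIGUSR1, original_handler as *const () as usize);
        // Installing the new handler returns the one it replaced.
        let prev = signal(SIGUSR1, chaining_handler as *const () as usize);
        PREVIOUS.store(prev, Ordering::SeqCst);
        raise(SIGUSR1);
    }
    CALLS.load(Ordering::SeqCst)
}

fn main() {
    // 10 from the chaining handler plus 1 from the original handler.
    println!("tally after one signal: {}", install_and_raise());
}
```

There is a short window between installing the new handler and storing the previous one; production code would need to account for a signal arriving in that window.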

dblnz · 2025-10-28T14:15:52Z

src/hyperlight_host/src/sandbox/initialized_multi_use.rs

+
+        // Register with crash handler if dumps are enabled
+        #[cfg(feature = "crashdump")]
+        if vm.runtime_config().guest_core_dump


Do we want this on by default?
Imagine running a huge number of sandboxes: something hangs, you want to terminate the process, and it starts dumping everything.
I know this is not a production use case, but still. Let me know what you think.
Also, I am curious what happens if we're running 100+ large sandboxes of, say, 500 MiB each 😄

simongdavies · 2025-10-29

Do you think we should have a configuration option that turns this on or off globally, set to false by default?

dblnz · 2025-10-29

I think for now we can leave it as is and introduce a configuration option if necessary.
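
If such an option is introduced later, it might look roughly like the sketch below, which also puts a number on the 100 × 500 MiB scenario raised above. `CrashDumpPolicy`, `dump_on_host_crash`, `should_register`, and `worst_case_dump_bytes` are hypothetical names; only the `guest_core_dump` flag comes from the diff.

```rust
/// Hypothetical process-wide switch; `Default` leaves it off.
#[derive(Debug, Clone, Copy, Default)]
pub struct CrashDumpPolicy {
    pub dump_on_host_crash: bool,
}

/// A sandbox registers with the crash handler only when the global switch
/// and its own `guest_core_dump` setting are both on.
pub fn should_register(policy: CrashDumpPolicy, guest_core_dump: bool) -> bool {
    policy.dump_on_host_crash && guest_core_dump
}

/// Upper bound on dump output if every sandbox writes its full memory.
/// Returns None on arithmetic overflow.
pub fn worst_case_dump_bytes(sandboxes: u64, bytes_per_sandbox: u64) -> Option<u64> {
    sandboxes.checked_mul(bytes_per_sandbox)
}

fn main() {
    let policy = CrashDumpPolicy::default();
    println!("register by default: {}", should_register(policy, true));

    let mib: u64 = 1024 * 1024;
    if let Some(total) = worst_case_dump_bytes(100, 500 * mib) {
        println!("worst case: {} bytes ({} MiB)", total, total / mib);
    }
}
```

For 100 sandboxes of 500 MiB each, the upper bound is 50,000 MiB, or roughly 48.8 GiB of dump data written during a single crash.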

simongdavies added the kind/enhancement label (For PRs adding features, improving functionality, docs, tests, etc.) on Oct 28, 2025

simongdavies force-pushed the create-vm-crashdumps-on-process-crash branch from 429ee4f to 0666756 on October 28, 2025 13:08

simongdavies changed the title from "DRAFT: Create crashdumps for VMs if process is crashing and creating a dump" to "DRAFT: Create crashdumps for active sandboxes if process is crashing and creating a crashdump" on Oct 28, 2025

Create crashdumps for VMs if process is crashing and creating a dump

1852abf

Signed-off-by: Simon Davies <[email protected]>

simongdavies force-pushed the create-vm-crashdumps-on-process-crash branch from 0666756 to 1852abf on October 28, 2025 13:50

dblnz reviewed Oct 28, 2025
