Rudimentary NVMe emulation fuzzer #966
This tries doing a bunch of random operations against an NVMe device and checks the operations against a limited model of what the results of those operations should be.

The initial stab at this is what caught #965, and it caught a bug in an intermediate state of #953 (which other phd tests did notice anyway). This fuzzing would probably be best with actual I/O operations mixed in, and I think that *should* be relatively straightforward to add from here, but as-is it's useful!

This would probably be best phrased as a `cargo-fuzz` test to at least get coverage-guided fuzzing. Because of the statefulness of NVMe, I think either way we'd want the model of expected device state and a pick-actions-then-run execution to further guide `cargo-fuzz` into useful parts of the device state.

The initial approach at this allowed for device reset and migration at arbitrary times via a separate thread. Once that required synchronizing the model of device state, it was effectively interleaved with "guest" operations on the device, and in practice admin commands are serialized by the `NvmeCtrl` state lock anyway. It may be more interesting to revisit with concurrent I/O operations on submission/completion queues.
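For readers who have not opened the diff, the overall shape of "pick a random action, run it, compare against a model" can be sketched roughly as below. This is not the code in this PR: `ToyDevice`, `Model`, `Action`, `Outcome`, `pick_action` and `fuzz` are hypothetical names, the "device" is a toy with two state bits rather than the real NVMe emulation, and a small hand-written SplitMix64 generator stands in for `Pcg64` so the sketch has no external dependencies.

```rust
use std::collections::BTreeSet;

/// Tiny deterministic PRNG (SplitMix64); a stand-in for the `Pcg64` used in the PR.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical, heavily simplified "guest" operations.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action { Enable, Reset, CreateQueue(u16), DeleteQueue(u16) }

#[derive(Debug, PartialEq)]
enum Outcome { Success, Rejected }

/// Toy device under test (the real emulation plays this role in the PR).
#[derive(Default)]
struct ToyDevice { enabled: bool, queues: Vec<u16> }

impl ToyDevice {
    fn apply(&mut self, action: Action) -> Outcome {
        match action {
            Action::Enable if !self.enabled => { self.enabled = true; Outcome::Success }
            Action::Enable => Outcome::Rejected,
            Action::Reset => { self.enabled = false; self.queues.clear(); Outcome::Success }
            Action::CreateQueue(id) if self.enabled && !self.queues.contains(&id) => {
                self.queues.push(id);
                Outcome::Success
            }
            Action::CreateQueue(_) => Outcome::Rejected,
            Action::DeleteQueue(id) => match self.queues.iter().position(|q| *q == id) {
                Some(i) => { self.queues.swap_remove(i); Outcome::Success }
                None => Outcome::Rejected,
            },
        }
    }
}

/// Independent model of the expected state, written separately from the device.
#[derive(Default)]
struct Model { enabled: bool, queues: BTreeSet<u16> }

impl Model {
    fn expect(&mut self, action: Action) -> Outcome {
        let accepted = match action {
            Action::Enable => { let was = self.enabled; self.enabled = true; !was }
            Action::Reset => { *self = Model::default(); true }
            // `insert` only runs when enabled, and returns false for a duplicate id.
            Action::CreateQueue(id) => self.enabled && self.queues.insert(id),
            Action::DeleteQueue(id) => self.queues.remove(&id),
        };
        if accepted { Outcome::Success } else { Outcome::Rejected }
    }
}

/// Pick the next action from the seeded stream.
fn pick_action(rng: &mut SplitMix64) -> Action {
    let id = (rng.next_u64() % 4) as u16;
    match rng.next_u64() % 8 {
        0 => Action::Reset,
        1 | 2 => Action::Enable,
        3..=5 => Action::CreateQueue(id),
        _ => Action::DeleteQueue(id),
    }
}

/// Run `iterations` random actions, stopping at the first model/device disagreement.
fn fuzz(seed: u64, iterations: usize) -> Result<(), String> {
    let mut rng = SplitMix64(seed);
    let (mut dev, mut model) = (ToyDevice::default(), Model::default());
    for step in 0..iterations {
        let action = pick_action(&mut rng);
        let expected = model.expect(action);
        let actual = dev.apply(action);
        if expected != actual {
            return Err(format!(
                "seed {} step {}: {:?} expected {:?}, got {:?}",
                seed, step, action, expected, actual
            ));
        }
    }
    Ok(())
}
```

Because everything is derived from the seed, a failing run can be replayed by calling `fuzz` with the seed from the error message.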
let mut rng = Pcg64::seed_from_u64(seed);

for _ in 0..1_000 {
`cargo test` gets through this in about 4 seconds, but `cargo test --release` is almost immediate. I'd been running this at 100k iterations there instead. Even a thousand seems like an OK place to be for CI?
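One way to keep a small default for CI while still allowing longer local runs would be an environment override, sketched below. Nothing here is in the PR: `NVME_FUZZ_ITERATIONS`, `DEFAULT_ITERATIONS`, `iterations_from` and `iterations` are hypothetical names chosen for illustration.

```rust
use std::env;

/// Default matching the 1_000 iterations currently in the loop.
const DEFAULT_ITERATIONS: u64 = 1_000;

/// Parse an optional override; a missing or unparseable value falls back to the default.
fn iterations_from(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_ITERATIONS)
}

/// Read the (hypothetical) `NVME_FUZZ_ITERATIONS` environment variable.
fn iterations() -> u64 {
    iterations_from(env::var("NVME_FUZZ_ITERATIONS").ok().as_deref())
}
```

With something like this, the loop would become `for _ in 0..iterations()`, and a release-mode run could be lengthened by setting the variable to `100000` without changing what CI does by default.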
/// any point as `PciNvme` technically does. (In practice, reset immediately
/// locks the inner `NvmeCtrl` to do the reset, for administrative options it is
/// effectively serialized anyway.)
struct FuzzCtx {
I'd talked with Patrick just a bit about the idea of having a more composable fuzzing framework as part of Propolis (the library). the implementation in this file is simultaneously:
- an NVMe driver, which is driven ~randomly to exercise `propolis/src/hw/nvme`
- a model of the NVMe controller state, based on the driver's operations (what queues are created, is the device initialized, etc)
- a model of the synthesis of NVMe spec and controller model: given the current state, what NVMe operations should have what outcomes, and what state transitions are permissible from the current state?
you could swap out the word "NVMe" above for any other device (the chipset would be interesting!). a reasonable thing to want would be fuzzing a pair of NVMe devices concurrently on the same PCI bridge, or poking a disk and NIC concurrently, or configuring a pair of NVMe devices to write on each other's queues, or ...
in the limit this seems to me like assembling an increasingly chaotic VM and configuring how chaotic it should be. I don't plan on adjusting this in that direction right now, but it seems like an interesting future direction this could take.
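To make the three roles above a bit more concrete, here is a rough sketch of what a composable interface might look like. It is purely illustrative: `FuzzTarget`, `step`, `fuzz_pair` and `ToySwitch` are hypothetical names, nothing like them exists in Propolis as far as this PR shows, and the "concurrency" here is just interleaving two targets from one seeded stream on a single thread.

```rust
use std::fmt::Debug;

/// Hypothetical interface: one fuzzable device plus the model of what it should do.
trait FuzzTarget {
    type Action: Copy + Debug;
    type Outcome: PartialEq + Debug;

    /// Driver role: turn raw entropy into an operation to attempt.
    fn pick(&self, entropy: u64) -> Self::Action;
    /// Model role: update the expected state and predict the result.
    fn expect(&mut self, action: Self::Action) -> Self::Outcome;
    /// Device role: perform the operation for real.
    fn apply(&mut self, action: Self::Action) -> Self::Outcome;
}

/// Run one step against a target, reporting any model/device disagreement.
fn step<T: FuzzTarget>(target: &mut T, entropy: u64) -> Result<(), String> {
    let action = target.pick(entropy);
    let expected = target.expect(action);
    let actual = target.apply(action);
    if expected == actual {
        Ok(())
    } else {
        Err(format!("{:?}: expected {:?}, got {:?}", action, expected, actual))
    }
}

/// Interleave two targets, choosing which one acts next from a seeded stream.
fn fuzz_pair<A: FuzzTarget, B: FuzzTarget>(
    a: &mut A,
    b: &mut B,
    seed: u64,
    steps: usize,
) -> Result<(), String> {
    let mut state = seed;
    for _ in 0..steps {
        // Simple 64-bit LCG, only here to keep the sketch dependency-free.
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let entropy = state >> 33;
        if entropy & 1 == 0 {
            step(a, entropy >> 1)?;
        } else {
            step(b, entropy >> 1)?;
        }
    }
    Ok(())
}

/// Toy target: a switch that rejects requests for the state it is already in.
#[derive(Default)]
struct ToySwitch {
    device_on: bool,
    model_on: bool,
}

impl FuzzTarget for ToySwitch {
    type Action = bool; // requested state
    type Outcome = bool; // whether the request was accepted

    fn pick(&self, entropy: u64) -> bool {
        entropy % 2 == 0
    }
    fn expect(&mut self, want: bool) -> bool {
        let accepted = self.model_on != want;
        self.model_on = want;
        accepted
    }
    fn apply(&mut self, want: bool) -> bool {
        let accepted = self.device_on != want;
        self.device_on = want;
        accepted
    }
}
```

A real version would need some way for targets to share state (a bus, guest memory), which this sketch deliberately leaves out.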