Contributor

@phip1611 phip1611 commented Sep 15, 2025

Summary of the PR

TL;DR: #322 but backported

As discussed with @roypat at the KVM Forum, and as is widely recognized in the community, adding new functionality or updating dependencies is currently quite challenging. For example, Cloud Hypervisor relies on numerous crates that themselves have deep interdependencies with kvm-ioctls and kvm-bindings.

By backporting features and introducing new functionality through a minor release, we can work around these challenges and provide a pragmatic path to incorporate this functionality into Cloud Hypervisor.

TODOs

  • How should I set up this backport PR? Against which branch should I merge it?

Test in Cloud Hypervisor

One can test that this works in Cloud Hypervisor with this patch:

Subject: [PATCH] xxx
migration: migrate nested guest state
---
Index: Cargo.lock
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/Cargo.lock b/Cargo.lock
--- a/Cargo.lock	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/Cargo.lock	(revision 7d1e6ac25b8c1c99d3069c81188d979c75b46e4d)
@@ -1053,9 +1053,7 @@

[[package]]
name = "kvm-bindings"
-version = "0.12.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "d4b153a59bb3ca930ff8148655b2ef68c34259a623ae08cf2fb9b570b2e45363"
+version = "0.12.1"
dependencies = [
"serde",
"vmm-sys-util",
@@ -1064,9 +1062,7 @@

[[package]]
name = "kvm-ioctls"
-version = "0.22.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "b702df98508cb63ad89dd9beb9f6409761b30edca10d48e57941d3f11513a006"
+version = "0.22.1"
dependencies = [
"bitflags 2.9.4",
"kvm-bindings",
Index: Cargo.toml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/Cargo.toml b/Cargo.toml
--- a/Cargo.toml	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/Cargo.toml	(revision 41ff284430d18ca3b6e8116c25d461d98e84d711)
@@ -107,8 +107,8 @@
[workspace.dependencies]
# rust-vmm crates
acpi_tables = { git = "https://github.com/rust-vmm/acpi_tables", branch = "main" }
-kvm-bindings = "0.12.0"
-kvm-ioctls = "0.22.0"
+kvm-bindings = "0.12.1"
+kvm-ioctls = "0.22.1"
# TODO: update to 0.13.1+
linux-loader = { git = "https://github.com/rust-vmm/linux-loader", branch = "main" }
mshv-bindings = "0.6.0"
@@ -153,3 +153,7 @@
uuid = { version = "1.18.1" }
wait-timeout = "0.2.1"
zerocopy = { version = "0.8.26", default-features = false }
+
+[patch.crates-io]
+kvm-bindings = { path = "../kvm-rs/kvm-bindings" }
+kvm-ioctls = { path = "../kvm-rs/kvm-ioctls" }
\ No newline at end of file
Index: hypervisor/src/cpu.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/hypervisor/src/cpu.rs b/hypervisor/src/cpu.rs
--- a/hypervisor/src/cpu.rs	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/hypervisor/src/cpu.rs	(revision 7d1e6ac25b8c1c99d3069c81188d979c75b46e4d)
@@ -13,6 +13,7 @@
#[cfg(target_arch = "aarch64")]
use std::sync::Arc;

+use kvm_bindings::nested::KvmNestedStateBuffer;
use thiserror::Error;
#[cfg(not(target_arch = "riscv64"))]
use vm_memory::GuestAddress;
@@ -334,6 +335,10 @@
///
#[error("Failed to inject NMI")]
Nmi(#[source] anyhow::Error),
+    #[error("Failed to get nested guest state")]
+    GetNestedState(#[source] anyhow::Error),
+    #[error("Failed to set nested guest state")]
+    SetNestedState(#[source] anyhow::Error),
     }

#[derive(Debug)]
@@ -514,6 +519,15 @@
/// This function is necessary to snapshot the VM
///
fn state(&self) -> Result<CpuState>;
+
+    /// Get the state of the nested guest from the current vCPU,
+    /// if there is any.
+    #[cfg(target_arch = "x86_64")]
+    fn nested_state(&self) -> Result<Option<KvmNestedStateBuffer>>;
+
+    /// Sets the state of the nested guest for the current vCPU.
+    #[cfg(target_arch = "x86_64")]
+    fn set_nested_state(&self, state: &KvmNestedStateBuffer) -> Result<()>;
///
/// Set the vCPU state.
/// This function is required when restoring the VM
Index: hypervisor/src/kvm/mod.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/hypervisor/src/kvm/mod.rs b/hypervisor/src/kvm/mod.rs
--- a/hypervisor/src/kvm/mod.rs	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/hypervisor/src/kvm/mod.rs	(revision 7d1e6ac25b8c1c99d3069c81188d979c75b46e4d)
@@ -86,6 +86,7 @@
///
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
pub use kvm_bindings::kvm_vcpu_events as VcpuEvents;
+use kvm_bindings::nested::KvmNestedStateBuffer;
pub use kvm_bindings::{
KVM_GUESTDBG_ENABLE, KVM_GUESTDBG_SINGLESTEP, KVM_IRQ_ROUTING_IRQCHIP, KVM_IRQ_ROUTING_MSI,
KVM_MEM_LOG_DIRTY_PAGES, KVM_MEM_READONLY, KVM_MSI_VALID_DEVID, kvm_clock_data,
@@ -2465,6 +2466,7 @@
let xcrs = self.get_xcrs()?;
let lapic_state = self.get_lapic()?;
let fpu = self.get_fpu()?;
+        let nested_state = self.nested_state()?;

         // Try to get all MSRs based on the list previously retrieved from KVM.
         // If the number of MSRs obtained from GET_MSRS is different from the
@@ -2539,6 +2541,7 @@
xcrs,
mp_state,
tsc_khz,
+            nested_state,
         }
         .into())
}
@@ -2705,6 +2708,9 @@
self.set_xcrs(&state.xcrs)?;
self.set_lapic(&state.lapic_state)?;
self.set_fpu(&state.fpu)?;
+        if let Some(nested_state) = state.nested_state {
+            self.set_nested_state(&nested_state)?;
+        }

         if let Some(freq) = state.tsc_khz {
             self.set_tsc_khz(freq)?;
@@ -2975,6 +2981,25 @@
Ok(_) => Ok(()),
}
}
+
+    fn nested_state(&self) -> cpu::Result<Option<KvmNestedStateBuffer>> {
+        let mut buffer = KvmNestedStateBuffer::empty();
+
+        self.fd
+            .lock()
+            .unwrap()
+            .get_nested_state(&mut buffer)
+            .map(|size| size.map(|_| buffer))
+            .map_err(|e| cpu::HypervisorCpuError::GetNestedState(e.into()))
+    }
+
+    fn set_nested_state(&self, state: &KvmNestedStateBuffer) -> cpu::Result<()> {
+        self.fd
+            .lock()
+            .unwrap()
+            .set_nested_state(state)
+            .map_err(|e| cpu::HypervisorCpuError::SetNestedState(e.into()))
+    }
     }

impl KvmVcpu {
Index: hypervisor/src/kvm/x86_64/mod.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/hypervisor/src/kvm/x86_64/mod.rs b/hypervisor/src/kvm/x86_64/mod.rs
--- a/hypervisor/src/kvm/x86_64/mod.rs	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/hypervisor/src/kvm/x86_64/mod.rs	(revision 7d1e6ac25b8c1c99d3069c81188d979c75b46e4d)
@@ -8,6 +8,7 @@
//
//

+use kvm_bindings::nested::KvmNestedStateBuffer;
use serde::{Deserialize, Serialize};
///
/// Export generically-named wrappers of kvm-bindings for Unix-based platforms
@@ -75,6 +76,9 @@
pub xcrs: ExtendedControlRegisters,
pub mp_state: MpState,
pub tsc_khz: Option<u32>,
+    // Option to prevent useless 8K (de)serialization when no nested
+    // state exists.
+    pub nested_state: Option<KvmNestedStateBuffer>,
     }

impl From<SegmentRegister> for kvm_segment {
Index: vmm/src/seccomp_filters.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/vmm/src/seccomp_filters.rs b/vmm/src/seccomp_filters.rs
--- a/vmm/src/seccomp_filters.rs	(revision ad9a1878bfb79a9a64607bfa613ae916169e1bc2)
+++ b/vmm/src/seccomp_filters.rs	(revision 7d1e6ac25b8c1c99d3069c81188d979c75b46e4d)
@@ -103,6 +103,8 @@
pub const KVM_GET_REG_LIST: u64 = 0xc008_aeb0;
pub const KVM_MEMORY_ENCRYPT_OP: u64 = 0xc008_aeba;
pub const KVM_NMI: u64 = 0xae9a;
+    pub const KVM_GET_NESTED_STATE: u64 = 0xc080_aebe;
+    pub const KVM_SET_NESTED_STATE: u64 = 0x4080_aebf;
     }

// MSHV IOCTL code. This is unstable until the kernel code has been declared stable.
@@ -232,6 +234,8 @@
and![Cond::new(1, ArgLen::Dword, Eq, KVM_SET_USER_MEMORY_REGION,)?],
and![Cond::new(1, ArgLen::Dword, Eq, KVM_SET_VCPU_EVENTS,)?],
and![Cond::new(1, ArgLen::Dword, Eq, KVM_NMI)?],
+        and![Cond::new(1, ArgLen::Dword, Eq, KVM_GET_NESTED_STATE)?],
+        and![Cond::new(1, ArgLen::Dword, Eq, KVM_SET_NESTED_STATE)?],
  ])
  }

@@ -697,6 +701,8 @@
and![Cond::new(1, ArgLen::Dword, Eq, KVM_SET_USER_MEMORY_REGION,)?],
and![Cond::new(1, ArgLen::Dword, Eq, KVM_RUN,)?],
and![Cond::new(1, ArgLen::Dword, Eq, KVM_NMI)?],
+        and![Cond::new(1, ArgLen::Dword, Eq, KVM_GET_NESTED_STATE)?],
+        and![Cond::new(1, ArgLen::Dword, Eq, KVM_SET_NESTED_STATE)?],
  ])
  }
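
The two ioctl numbers allowed in the seccomp filter above follow the generic Linux ioctl encoding used on x86_64. The sketch below shows how they can be derived instead of hard-coded. The helper `ioc` and the constant `NESTED_STATE_HEADER_SIZE` are names made up for this illustration; they are not part of Cloud Hypervisor or kvm-ioctls. The 128-byte header size is the assumption that makes the values line up with the patch.

```rust
// Generic Linux ioctl encoding as used on x86_64:
// bits 0..8 = command number, 8..16 = type, 16..30 = size, 30..32 = direction.
const IOC_WRITE: u64 = 1;
const IOC_READ: u64 = 2;

// ioctl type byte used by all KVM ioctls.
const KVMIO: u64 = 0xAE;

// Assumption: size in bytes of the fixed `struct kvm_nested_state` header.
const NESTED_STATE_HEADER_SIZE: u64 = 128;

/// Hypothetical helper that packs the four fields into one ioctl number.
const fn ioc(dir: u64, ty: u64, nr: u64, size: u64) -> u64 {
    (dir << 30) | (size << 16) | (ty << 8) | nr
}

// Read/write ioctl, command 0xbe.
pub const KVM_GET_NESTED_STATE: u64 =
    ioc(IOC_READ | IOC_WRITE, KVMIO, 0xbe, NESTED_STATE_HEADER_SIZE);

// Write-only ioctl, command 0xbf.
pub const KVM_SET_NESTED_STATE: u64 = ioc(IOC_WRITE, KVMIO, 0xbf, NESTED_STATE_HEADER_SIZE);
```

Both results match the values in the patch: 3229658814 and 1082175167 in decimal, or 0xc080_aebe and 0x4080_aebf in the hexadecimal notation of the neighbouring constants.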

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR have Signed-Off-By trailers (with
    git commit -s), and the commit message has max 60 characters for the
    summary and max 75 characters for each description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

This type is a helper, making the use of get_nested_state() and
set_nested_state(), which are added in a following commit, much more
convenient.

Note that this type's name uses UpperCamelCase as it is not just
a plain old data type but actually contains some logic: the
`size` field is properly initialized.

Effectively, KVM expects a dynamically sized buffer whose header reports
its size, either to store the nested state in or to load it from. Since
data structures with such alignment requirements are challenging to work
with (note that a Vec<u8> only guarantees an alignment of 1, but we
need 4), this type accepts a little memory overhead in some cases in
exchange for better UX; copying 8K once is cheap anyway.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
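
The alignment argument in the commit message above can be illustrated with a minimal sketch. `Header` and `StateBuffer` are hypothetical stand-ins, not the actual `KvmNestedStateBuffer` from kvm-bindings, and the 8192-byte total is an assumption that mirrors the "8K" figure mentioned in the message. The idea is a fixed-capacity `repr(C)` struct whose header contains a `u32`, so the whole buffer is 4-byte aligned and its `size` field is initialized on construction.

```rust
use std::mem::size_of;

/// Hypothetical fixed header that the kernel would read first.
/// Only the fields needed for this illustration are modeled.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Header {
    pub flags: u16,
    pub format: u16,
    /// Total size of the buffer in bytes, header included.
    pub size: u32,
}

/// Assumed total capacity (mirrors the "8K" from the commit message).
pub const TOTAL_LEN: usize = 8192;
/// Payload bytes that remain after the header.
pub const DATA_LEN: usize = TOTAL_LEN - size_of::<Header>();

/// Header followed by payload. `repr(C)` keeps the field order, and the
/// `u32` inside `Header` gives the whole struct an alignment of 4.
#[repr(C)]
pub struct StateBuffer {
    pub header: Header,
    pub data: [u8; DATA_LEN],
}

impl StateBuffer {
    /// Creates a zeroed buffer whose header already reports the capacity.
    pub fn empty() -> Self {
        Self {
            header: Header {
                flags: 0,
                format: 0,
                size: TOTAL_LEN as u32,
            },
            data: [0; DATA_LEN],
        }
    }
}
```

A caller would construct the buffer with `StateBuffer::empty()` and hand a pointer to it to the ioctl wrapper. The fixed capacity trades some memory for not having to manage an over-aligned heap allocation by hand.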
These calls are relevant for live-migration and state save/resume
when nested virtualization is used.

I tested everything with a nested guest in Cloud Hypervisor, but these
patches are not yet upstream.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
@phip1611 phip1611 force-pushed the backport-nested-stuff branch from f2d41f7 to 684e2e2 on September 15, 2025 07:26
Member

@RuoqingHe RuoqingHe left a comment

Are you sure you want to backport patches to main branch 🤔

Make this more fail-safe.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
@phip1611
Contributor Author

> Are you sure you want to backport patches to main branch 🤔

I am wondering about this too :) I asked for guidance in the PR description:

> How should I set up this backport PR? Against which branch should I merge it?

Happy to hear suggestions!

@RuoqingHe
Member

> How should I set up this backport PR? Against which branch should I merge it?

Basically, we need to create a branch (if it does not exist yet) at the commit of kvm-ioctls v0.22 and kvm-bindings v0.12. After that, you can just open the PR against that branch, IIUC.

@phip1611
Contributor Author

Closing this in favor of #349.

@phip1611 phip1611 closed this Sep 16, 2025