Kernel KVM virtualization development
* [RFC PATCH 0/2] KVM: x86: Relay a nested Hyper-V root's vmbus posts to L0
@ 2026-06-17 14:51 Robert Nowotny
  2026-06-17 15:53 ` Sean Christopherson
  0 siblings, 1 reply; 2+ messages in thread
From: Robert Nowotny @ 2026-06-17 14:51 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, linux-kernel, x86, jostarks



This RFC asks for direction on a small KVM/x86 addition before adding a
selftest and an SVM counterpart. It lets a nested Hyper-V root partition's
vmbus come up when the L1 hypervisor runs under KVM with a userspace VMM that
owns the host vmbus endpoint.

Patch 1 renames nested_evmcs_l2_tlb_flush_enabled() to
nested_evmcs_l2_direct_hypercall_enabled(), since the predicate is really "L1
granted this L2 the eVMCS direct-hypercall facility" and a second caller now
shares it. No functional change.

Patch 2 adds the relay.

The userspace consumer is OpenVMM (https://github.com/microsoft/openvmm); the
companion change that enables this capability and sets the bitmask will be
posted to OpenVMM later.

Problem
-------
A Windows guest that enables Hyper-V/VBS runs its own kernel as the root
partition of a nested hypervisor, i.e. as an L2 guest: guest kernel ->
nested hypervisor (L1) -> KVM (L0). The root's vmbus never connects. Its
HvPostMessage(InitiateContact) is an L2 VMCALL that exits to L0 and is
reflected up to L1, which has no path to forward it to the userspace VMM. The
guest bugchecks 0x7B early in boot.

What the patch does
-------------------
Add a per-VM capability whose argument is a bitmask of the nested Hyper-V
hypercall classes userspace wants kept in L0 (HvPostMessage, HvSignalEvent).
For a selected class, and when L1 has authorized the L2 for direct nested
hypercalls (nested_evmcs_l2_direct_hypercall_enabled(), the gate KVM already
honors for the L2 TLB-flush hypercall), the L2 VMCALL is handled in L0 instead
of reflected to L1: KVM clears the nested bit, translates the L2 GPA in the
input parameter to an L1 GPA via the nested MMU, and lets the existing
hypercall path deliver the post to userspace via KVM_EXIT_HYPERV, exactly as
for a non-nested guest.
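
As a rough illustration of the gate described above (a standalone userspace
model, not code from the patches): the call codes and the control-word layout
follow the TLFS, while the RELAY_* class bits and all function names are
hypothetical. It also assumes that "the nested bit" means bit 31 (Is Nested)
of the hypercall input value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Call codes and control-word layout as documented in the Hyper-V TLFS. */
#define HVCALL_POST_MESSAGE  0x005cull
#define HVCALL_SIGNAL_EVENT  0x005dull
#define HV_CALL_CODE_MASK    0xffffull      /* bits 15:0 of the input value */
#define HV_HYPERCALL_NESTED  (1ull << 31)   /* "Is Nested" flag */

/* Hypothetical class bits for the capability's args[0]; not real uAPI. */
#define RELAY_POST_MESSAGE   (1ull << 0)
#define RELAY_SIGNAL_EVENT   (1ull << 1)

static_assert(RELAY_POST_MESSAGE != RELAY_SIGNAL_EVENT,
              "each relay class needs its own bit");

/* Map a hypercall control word to its relay class bit, or 0 if none. */
static inline uint64_t relay_class_of(uint64_t control)
{
    switch (control & HV_CALL_CODE_MASK) {
    case HVCALL_POST_MESSAGE:
        return RELAY_POST_MESSAGE;
    case HVCALL_SIGNAL_EVENT:
        return RELAY_SIGNAL_EVENT;
    default:
        return 0;
    }
}

/*
 * Model of the decision: keep the L2 VMCALL in L0 only if userspace
 * selected the class AND L1 granted this L2 direct nested hypercalls.
 * Otherwise the exit is reflected to L1 as before.
 */
static inline bool keep_l2_vmcall_in_l0(uint64_t control, uint64_t relay_mask,
                                        bool l1_direct_hypercall_ok)
{
    uint64_t cls = relay_class_of(control);

    return cls != 0 && (relay_mask & cls) != 0 && l1_direct_hypercall_ok;
}

/* Clear the nested flag so the call looks like a non-nested hypercall. */
static inline uint64_t strip_nested_bit(uint64_t control)
{
    return control & ~HV_HYPERCALL_NESTED;
}
```

In this model a HvSignalEvent from L2 is still reflected to L1 when userspace
selected only the HvPostMessage class, and nothing is kept in L0 unless L1 has
granted the direct-hypercall facility.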

Why this belongs in the kernel
------------------------------
The message handling already lives in userspace and does not move: a
non-nested HvPostMessage exits to userspace today via KVM_EXIT_HYPERV, and the
relayed nested post takes the same exit. Only two steps cannot be done in
userspace with the current uAPI, and both are kernel-only primitives:

   1. Suppressing nested exit reflection. The "keep this L2 VMCALL in L0
      instead of reflecting to L1" decision is made in
      nested_vmx_reflect_vmexit(); KVM does not exit to userspace on a nested
      L2 VM-exit before deciding reflection, and adding such an exit would be
      a much broader and riskier ABI. A nested exit also cannot be cleanly
      reflected to L1 after a userspace round-trip, which is why the decision
      stays in the kernel.
   2. Translating the L2 GPA to an L1 GPA, which needs the nested MMU /
      shadow EPT that userspace cannot walk.
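
To make step 2 concrete, here is a deliberately simplified toy model, not
kernel code: it stands in for the multi-level nested EPT walk with a flat
lookup table, and every name in it is hypothetical. It shows only the shape of
the operation: the L2 guest frame number is mapped to an L1 frame number and
the offset within the 4 KiB page is carried over unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_OFFSET_MASK  ((1ull << TOY_PAGE_SHIFT) - 1)

/* One toy mapping: L2 guest frame number -> L1 guest frame number. */
struct toy_map_entry {
    uint64_t l2_gfn;
    uint64_t l1_gfn;
};

/*
 * Translate an L2 GPA to an L1 GPA using the flat toy table.
 * Returns false when no mapping exists; a real walk would instead
 * report a fault for L1 to resolve.
 */
static inline bool toy_translate_l2_gpa(const struct toy_map_entry *map,
                                        size_t nr_entries, uint64_t l2_gpa,
                                        uint64_t *l1_gpa)
{
    uint64_t gfn = l2_gpa >> TOY_PAGE_SHIFT;

    assert(map != NULL || nr_entries == 0);
    assert(l1_gpa != NULL);

    for (size_t i = 0; i < nr_entries; i++) {
        if (map[i].l2_gfn == gfn) {
            /* New frame number, same offset within the page. */
            *l1_gpa = (map[i].l1_gfn << TOY_PAGE_SHIFT) |
                      (l2_gpa & TOY_OFFSET_MASK);
            return true;
        }
    }
    return false;
}
```

The table here is supplied by the caller only for illustration. In the real
case the mapping is owned by L1 and shadowed by KVM, and userspace has no
interface to walk it.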

The relayable set is a userspace-supplied bitmask
-------------------------------------------------
args[0] selects which nested Hyper-V hypercall classes to keep in L0. The
reflection decision stays in the kernel, the choice of which calls to relay is
userspace's, and the kernel carries no vmbus-specific policy. New relayable
nested hypercalls can be added without another kernel change.
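
A minimal sketch of how a VMM might enable this from userspace, assuming the
capability is enabled with the standard KVM_ENABLE_CAP ioctl on the VM file
descriptor. The capability macro name and the RELAY_* bit values are
hypothetical, and 249 is only the placeholder number mentioned in this cover
letter, not an assigned capability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Placeholder number from the cover letter; NOT an assigned capability. */
#define TOY_KVM_CAP_HYPERV_NESTED_RELAY  249

/* Hypothetical class bits for args[0]; the real uAPI may differ. */
#define RELAY_POST_MESSAGE  (1ull << 0)
#define RELAY_SIGNAL_EVENT  (1ull << 1)

/* Build the KVM_ENABLE_CAP argument with the relay bitmask in args[0]. */
static inline struct kvm_enable_cap make_relay_cap(uint64_t relay_mask)
{
    struct kvm_enable_cap cap;

    memset(&cap, 0, sizeof(cap));   /* flags and unused args must be zero */
    cap.cap = TOY_KVM_CAP_HYPERV_NESTED_RELAY;
    cap.args[0] = relay_mask;
    return cap;
}

/* Enable the relay on a VM fd; returns 0 on success, -1 with errno set. */
static inline int enable_nested_relay(int vm_fd, uint64_t relay_mask)
{
    struct kvm_enable_cap cap = make_relay_cap(relay_mask);

    assert(vm_fd >= -1);
    return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

A VMM that wants both classes kept in L0 would pass
RELAY_POST_MESSAGE | RELAY_SIGNAL_EVENT as the mask. On a kernel without the
capability the ioctl fails, so the caller should treat a failure as "relay
unavailable".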

Scope and limitations
---------------------
   - VMX-only; no SVM counterpart yet.
   - The capability number 249 is a placeholder pending assignment.
   - No selftest yet (this is an RFC for direction). A selftest and, if the
     relay stays, an SVM path would come with the non-RFC series.

Tooling transparency
--------------------
This work was developed with AI assistance (Claude, claude-opus-4-8),
reflected in each patch's Assisted-by tag. The assistant analyzed the
nested-exit reflection and Hyper-V hypercall paths, drafted the comments and
changelogs, and cross-checked the behavior against the TLFS and the existing
L2 TLB-flush handling. The mechanism was derived from runtime analysis of a
stock Windows guest that bugchecks 0x7B without the relay and boots with it.
The submitter has reviewed the change in full and takes responsibility for it.

Testing
-------
The relay mechanism was validated on a Proxmox VE 7.0.2 kernel (the same
logic, applied to that tree): a stock nested Windows guest under a userspace
VMM that owns the host vmbus endpoint fails to bring up its root vmbus (0x7B)
without the capability and boots to the full desktop with it. checkpatch is
clean on both patches. A mainline KVM_INTEL=m KVM_AMD=m KVM_WERROR=y build and
a KVM selftest are still to come with the non-RFC series.


Yours sincerely

Ing. Robert Nowotny
CTO, Executive Technical Director

Rotek GmbH <https://www.rotek.at>

------------------------------------------------------------------
*Company Information :*
Rotek Handels GmbH
Handelsstrasse 4
A-2201 Hagenbrunn
Austria

Tel : +43-2246-20791-23
Fax : +43-2246-20791-50

Executive Director: Robert Rernböck
Registered under : FN271982z, Landesgericht Korneuburg
VAT Number : ATU62139135
------------------------------------------------------------------
*CONTACT:*
mailto: rnowotny@rotek.at
Web: https://www.rotek.at
------------------------------------------------------------------



* Re: [RFC PATCH 0/2] KVM: x86: Relay a nested Hyper-V root's vmbus posts to L0
  2026-06-17 14:51 [RFC PATCH 0/2] KVM: x86: Relay a nested Hyper-V root's vmbus posts to L0 Robert Nowotny
@ 2026-06-17 15:53 ` Sean Christopherson
  0 siblings, 0 replies; 2+ messages in thread
From: Sean Christopherson @ 2026-06-17 15:53 UTC (permalink / raw)
  To: Robert Nowotny
  Cc: Paolo Bonzini, Vitaly Kuznetsov, kvm, linux-kernel, x86, jostarks

On Wed, Jun 17, 2026, Robert Nowotny wrote:
> This RFC asks for direction on a small KVM/x86 addition before adding a
> selftest and an SVM counterpart. It lets a nested Hyper-V root partition's
> vmbus come up when the L1 hypervisor runs under KVM with a userspace VMM that
> owns the host vmbus endpoint.
> 
> Patch 1 renames nested_evmcs_l2_tlb_flush_enabled() to
> nested_evmcs_l2_direct_hypercall_enabled(), since the predicate is really "L1
> granted this L2 the eVMCS direct-hypercall facility" and a second caller now
> shares it. No functional change.
> 
> Patch 2 adds the relay.

I didn't get any patches.  Lore doesn't have them either.
