public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping
@ 2026-03-13  0:10 Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

Jim pointed out that VMRUN/VMLOAD/VMSAVE injecting a #GP when the vmcb12
GPA is valid but not mappable is not architectural [1]. The series
handles them as emulation failures and (mostly) exits to userspace
instead. It also fixes the checks performed on the vmcb12 GPA (i.e. RAX)
in a few places.

v2 -> v3:
- Drop the patch simplifying error handling of
  nested_svm_copy_vmcb12_to_cache() as it was picked up into
  kvm-x86/next.
- Drop the legal GPA check on RAX in the emulator instead of fixing it
  [Sean].
- Fix legal GPA check on RAX in the #GP interception path [Sean].
- Move legal GPA check to VMRUN/VMLOAD/VMSAVE interception handlers
  [Yosry].
- Update the selftest to use the first GPA after memslots, rather than
  the maximum legal GPA, as the unmappable GPA. This is needed because
  the maximum legal GPA sometimes still produces a #GP if it's in a
  reserved area [Yosry].

v2: https://lore.kernel.org/kvm/20260306210900.1933788-1-yosry@kernel.org/


Yosry Ahmed (7):
  KVM: SVM: Drop RAX check for SVM instructions from the emulator
  KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM
    insns
  KVM: SVM: Move RAX legality check to SVM insn interception handlers
  KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation
  KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12
    fails
  KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
  KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's
    name

 arch/x86/kvm/emulate.c                        |  17 +-
 arch/x86/kvm/svm/nested.c                     |  11 +-
 arch/x86/kvm/svm/svm.c                        |  37 ++--
 tools/testing/selftests/kvm/Makefile.kvm      |   2 +-
 .../kvm/x86/svm_nested_invalid_vmcb12_gpa.c   |  98 ----------
 .../selftests/kvm/x86/svm_nested_vmcb12_gpa.c | 176 ++++++++++++++++++
 6 files changed, 203 insertions(+), 138 deletions(-)
 delete mode 100644 tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c


base-commit: bfd7f4adc1230373c25e1b787a6f1ee407eb0656
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-15 12:55   ` Paolo Bonzini
  2026-03-13  0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

The check for legal GPA in RAX hardcodes a mask for 48 bits of physical
address width. This incorrectly injects a #GP for valid 52-bit GPAs.
However, instead of fixing the check, remove it completely as it is
unnecessary.
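
To illustrate the bug being removed, here is a minimal standalone sketch
(hypothetical helper names, not KVM code): the hardcoded mask flags any
GPA with bits 48-63 set, even when the guest's MAXPHYADDR is larger than
48 bits, whereas the architectural question is only whether the GPA
exceeds MAXPHYADDR:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The emulator's old check: hardcodes a 48-bit physical address width. */
static bool old_check_rejects(uint64_t rax)
{
	return (rax & 0xffff000000000000ULL) != 0;
}

/*
 * What the check should have asked: does the GPA exceed the guest's
 * MAXPHYADDR? (maxphyaddr is assumed to be < 64.)
 */
static bool exceeds_maxphyaddr(uint64_t rax, unsigned int maxphyaddr)
{
	return (rax >> maxphyaddr) != 0;
}
```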

If RAX contains an illegal GPA, the CPU should inject #GP. If KVM
intercepts #GP, the emulator is only used for decoding the instruction.
Otherwise, if RAX is illegal from the guest's perspective but not the
host's (due to allow_smaller_maxphyaddr), then KVM should always
intercept the instructions (as NPT and VLS should both be disabled). The
interception path for VMRUN/VMSAVE/VMLOAD does not invoke the emulator
either. Hence, the emulator can never be invoked with an actually
illegal RAX.

Outside of forced emulation or code stream rewriting, the emulator
should only be invoked for these instructions in cases such as RAX
having a legal GPA that lies outside guest memory, as the #NPF
interception handler will try to emulate the instruction after failing
to create a proper mapping in the NPT. In this case, the emulator's
responsibility ends with checking pre-intercept exceptions and
intercepts, it does not actually emulate these instructions.

According to the APM, #GP due to invalid op happens after the
interception check:

  Generally, instruction intercepts are checked after simple exceptions
  (such as #GP—when CPL is incorrect—or #UD) have been checked, but
  before exceptions related to memory accesses (such as page faults) and
  exceptions based on specific operand values.

Arguably, the emulator's checks for EFER.SVME and intercepts are also
unnecessary. If EFER.SVME is cleared or if L1 intercepts
VMRUN/VMSAVE/VMLOAD (for nested), then KVM should always be intercepting
these instructions anyway, and the emulator should not be invoked (see
above). Leave dealing with that for later.

Fixes: 01de8b09e606 ("KVM: SVM: Add intercept checks for SVM instructions")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/emulate.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6145dac4a605a..a449a00555da1 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3883,17 +3883,6 @@ static int check_svme(struct x86_emulate_ctxt *ctxt)
 	return X86EMUL_CONTINUE;
 }
 
-static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
-{
-	u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
-
-	/* Valid physical address? */
-	if (rax & 0xffff000000000000ULL)
-		return emulate_gp(ctxt, 0);
-
-	return check_svme(ctxt);
-}
-
 static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
 {
 	u64 cr4 = ctxt->ops->get_cr(ctxt, 4);
@@ -3997,10 +3986,10 @@ static const struct opcode group7_rm2[] = {
 };
 
 static const struct opcode group7_rm3[] = {
-	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme_pa),
+	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme),
 	II(SrcNone  | Prot | EmulateOnUD,	em_hypercall,	vmmcall),
-	DIP(SrcNone | Prot | Priv,		vmload,		check_svme_pa),
-	DIP(SrcNone | Prot | Priv,		vmsave,		check_svme_pa),
+	DIP(SrcNone | Prot | Priv,		vmload,		check_svme),
+	DIP(SrcNone | Prot | Priv,		vmsave,		check_svme),
 	DIP(SrcNone | Prot | Priv,		stgi,		check_svme),
 	DIP(SrcNone | Prot | Priv,		clgi,		check_svme),
 	DIP(SrcNone | Prot | Priv,		skinit,		check_svme),
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

KVM intercepts #GP when EFER.SVME is set if the CPU does not have
X86_FEATURE_SVME_ADDR_CHK to work around an erratum where some CPUs
check EAX against reserved memory regions. KVM re-injects the #GP if an
SVM instruction was executed with a misaligned RAX, and otherwise
emulates the instruction.

However, a #GP should also be reinjected if RAX contains an illegal GPA.
According to the APM, one of the #GP conditions is:

  rAX referenced a physical address above the maximum
  supported physical address.

Replace the PAGE_MASK check with page_address_valid(), which checks both
page-alignment as well as the legality of the GPA based on the vCPU's
MAXPHYADDR.
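
For reference, a minimal model of the two conditions page_address_valid()
enforces (standalone sketch with hypothetical names; the real helper
lives in KVM's x86 code and reads the vCPU's cached MAXPHYADDR):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Sketch: a GPA is acceptable for SVM instructions only if it is both
 * page-aligned and representable in the guest's MAXPHYADDR bits
 * (maxphyaddr is assumed to be < 64).
 */
static bool gpa_is_legal(uint64_t gpa, unsigned int maxphyaddr)
{
	bool page_aligned = (gpa & (SKETCH_PAGE_SIZE - 1)) == 0;
	bool within_maxphyaddr = (gpa >> maxphyaddr) == 0;

	return page_aligned && within_maxphyaddr;
}
```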

Note that this is currently not a problem, because kvm_vcpu_map() should
fail on illegal GPAs and inject a #GP anyway. However, following patches
will change the failure behavior of kvm_vcpu_map(), so make sure the #GP
interception handler does this appropriately.

Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/svm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d98fbc0e58e8f..796a6887305d6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2319,8 +2319,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 			return kvm_emulate_instruction(vcpu,
 				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
 	} else {
-		/* All SVM instructions expect page aligned RAX */
-		if (svm->vmcb->save.rax & ~PAGE_MASK)
+		if (!page_address_valid(vcpu, svm->vmcb->save.rax))
 			goto reinject;
 
 		return emulate_svm_instr(vcpu, opcode);
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-13 18:17   ` Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

When #GP is intercepted by KVM, the #GP interception handler checks
whether the GPA in RAX is legal and reinjects the #GP accordingly.
Otherwise, it calls into the appropriate interception handler for
VMRUN/VMLOAD/VMSAVE. The intercept handlers do not check RAX.

However, according to the APM, the interception takes precedence
over #GP due to an invalid operand:

  Generally, instruction intercepts are checked after simple exceptions
  (such as #GP—when CPL is incorrect—or #UD) have been checked, but
  before exceptions related to memory accesses (such as page faults) and
  exceptions based on specific operand values.

Move the check into the interception handlers for VMRUN/VMLOAD/VMSAVE as
the CPU does not check RAX before the interception.

Opportunistically make the non-SVM insn path in gp_interception() do an
early return to reduce indentation.

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c |  5 +++++
 arch/x86/kvm/svm/svm.c    | 34 +++++++++++++++++-----------------
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 5ff01d2ac85e4..016bf88ec2def 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1115,6 +1115,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	vmcb12_gpa = svm->vmcb->save.rax;
 
+	if (!page_address_valid(vcpu, vmcb12_gpa)) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
 	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
 	if (ret) {
 		if (ret == -EFAULT) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 796a6887305d6..f019a3f7705ae 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2183,6 +2183,7 @@ static int intr_interception(struct kvm_vcpu *vcpu)
 static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 vmcb12_gpa = svm->vmcb->save.rax;
 	struct vmcb *vmcb12;
 	struct kvm_host_map map;
 	int ret;
@@ -2190,7 +2191,12 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 	if (nested_svm_check_permissions(vcpu))
 		return 1;
 
-	ret = kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map);
+	if (!page_address_valid(vcpu, vmcb12_gpa)) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	ret = kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map);
 	if (ret) {
 		if (ret == -EINVAL)
 			kvm_inject_gp(vcpu, 0);
@@ -2306,24 +2312,18 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 		goto reinject;
 
 	opcode = svm_instr_opcode(vcpu);
+	if (opcode != NONE_SVM_INSTR)
+		return emulate_svm_instr(vcpu, opcode);
 
-	if (opcode == NONE_SVM_INSTR) {
-		if (!enable_vmware_backdoor)
-			goto reinject;
-
-		/*
-		 * VMware backdoor emulation on #GP interception only handles
-		 * IN{S}, OUT{S}, and RDPMC.
-		 */
-		if (!is_guest_mode(vcpu))
-			return kvm_emulate_instruction(vcpu,
-				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
-	} else {
-		if (!page_address_valid(vcpu, svm->vmcb->save.rax))
-			goto reinject;
+	if (!enable_vmware_backdoor)
+		goto reinject;
 
-		return emulate_svm_instr(vcpu, opcode);
-	}
+	/*
+	 * VMware backdoor emulation on #GP interception only handles
+	 * IN{S}, OUT{S}, and RDPMC.
+	 */
+	if (!is_guest_mode(vcpu))
+		return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
 
 reinject:
 	kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
                   ` (2 preceding siblings ...)
  2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

Currently, a #GP is only injected if kvm_vcpu_map() fails with -EINVAL,
but it could also fail with -EFAULT if creating a host mapping failed.
Inject a #GP in all cases; there is no reason to treat the failure modes
differently.

Similar to commit 01ddcdc55e09 ("KVM: nSVM: Always inject a #GP if
mapping VMCB12 fails on nested VMRUN"), treat all failures equally.

Fixes: 8c5fbf1a7231 ("KVM/nSVM: Use the new mapping API for mapping guest memory")
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/svm.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f019a3f7705ae..3b1516ea45d4f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2196,10 +2196,8 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 		return 1;
 	}
 
-	ret = kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map);
-	if (ret) {
-		if (ret == -EINVAL)
-			kvm_inject_gp(vcpu, 0);
+	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map)) {
+		kvm_inject_gp(vcpu, 0);
 		return 1;
 	}
 
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
                   ` (3 preceding siblings ...)
  2026-03-13  0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed
  6 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

KVM currently injects a #GP if mapping vmcb12 fails when emulating
VMRUN/VMLOAD/VMSAVE. This is not architectural behavior, as #GP should
only be injected if the physical address is not supported or not
aligned. Instead, handle it as an emulation failure, similar to how nVMX
handles failures to read/write guest memory in several emulation paths.

When virtual VMLOAD/VMSAVE is enabled, if vmcb12's GPA is not mapped in
the NPTs, a VMEXIT(#NPF) will be generated, and KVM will install an MMIO
SPTE and emulate the instruction if there is no corresponding memslot.
x86_emulate_insn() will return EMULATION_FAILED as VMLOAD/VMSAVE are not
handled as part of the twobyte_insn cases.

Even though this also results in an emulation failure, it only results
in a straight return to userspace if KVM_CAP_EXIT_ON_EMULATION_FAILURE
is set. Otherwise, KVM injects a #UD and only exits to userspace if not
in guest mode. So the behavior is slightly different when virtual
VMLOAD/VMSAVE is enabled.

Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c | 6 ++----
 arch/x86/kvm/svm/svm.c    | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 016bf88ec2def..8320c98b704ce 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1122,10 +1122,8 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
 	if (ret) {
-		if (ret == -EFAULT) {
-			kvm_inject_gp(vcpu, 0);
-			return 1;
-		}
+		if (ret == -EFAULT)
+			return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
 
 		/* Advance RIP past VMRUN as part of the nested #VMEXIT. */
 		return kvm_skip_emulated_instruction(vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3b1516ea45d4f..a0dacbeaa3c5a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2196,10 +2196,8 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 		return 1;
 	}
 
-	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map)) {
-		kvm_inject_gp(vcpu, 0);
-		return 1;
-	}
+	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map))
+		return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
 
 	vmcb12 = map.hva;
 
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
                   ` (4 preceding siblings ...)
  2026-03-13  0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  2026-03-13  0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed
  6 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

The test currently allegedly makes sure that VMRUN causes a #GP if the
vmcb12 GPA is valid but unmappable. However, it calls run_guest() with
the test's vmcb12 GPA, and the #GP is produced by VMLOAD, not VMRUN.

Additionally, the underlying KVM behavior just changed to match
architectural behavior: all of VMRUN/VMLOAD/VMSAVE now fail emulation if
vmcb12 cannot be mapped, while the CPU still injects a #GP if the vmcb12
GPA exceeds MAXPHYADDR.

Rework the test to use the KVM_ONE_VCPU_TEST[_SUITE] harness, and
test all of VMRUN/VMLOAD/VMSAVE with both an invalid GPA (-1ULL) causing
a #GP, and a valid but unmappable GPA causing emulation failure. Execute
the instructions directly from L1 instead of run_guest() to make sure
the #GP or emulation failure is produced by the right instruction.

Leave the #VMEXIT with unmappable GPA test case as-is, but wrap it with
a test harness as well.

Opportunistically drop gp_triggered, as the test already checks that
a #GP was injected through a SYNC, and add an assertion that the max
legal GPA is in fact not mapped by userspace (i.e. KVM cannot map it).
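
The "valid but unmappable" GPA the reworked test relies on can be modeled
as the first GPA past the end of every memslot (standalone sketch with
hypothetical names; the real selftest walks the VM's memslot hash): such
a GPA has no backing memory, so KVM cannot map it, yet it can still be
below MAXPHYADDR and therefore legal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a KVM userspace memory region. */
struct memslot {
	uint64_t gpa;	/* guest physical base address */
	uint64_t size;	/* region size in bytes */
};

/*
 * Return the first GPA past the highest memslot end: guaranteed to be
 * unbacked, hence unmappable by KVM, while typically still legal.
 */
static uint64_t first_unbacked_gpa(const struct memslot *slots, int n)
{
	uint64_t end = 0;

	for (int i = 0; i < n; i++) {
		uint64_t slot_end = slots[i].gpa + slots[i].size;

		if (slot_end > end)
			end = slot_end;
	}
	return end;
}
```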

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 .../kvm/x86/svm_nested_invalid_vmcb12_gpa.c   | 152 +++++++++++++-----
 1 file changed, 115 insertions(+), 37 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c b/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
index c6d5f712120d1..569869bed20b5 100644
--- a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
+++ b/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
@@ -6,6 +6,8 @@
 #include "vmx.h"
 #include "svm_util.h"
 #include "kselftest.h"
+#include "kvm_test_harness.h"
+#include "test_util.h"
 
 
 #define L2_GUEST_STACK_SIZE 64
@@ -13,86 +15,162 @@
 #define SYNC_GP 101
 #define SYNC_L2_STARTED 102
 
-u64 valid_vmcb12_gpa;
-int gp_triggered;
+static unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
 
 static void guest_gp_handler(struct ex_regs *regs)
 {
-	GUEST_ASSERT(!gp_triggered);
 	GUEST_SYNC(SYNC_GP);
-	gp_triggered = 1;
-	regs->rax = valid_vmcb12_gpa;
 }
 
-static void l2_guest_code(void)
+static void l2_code(void)
 {
 	GUEST_SYNC(SYNC_L2_STARTED);
 	vmcall();
 }
 
-static void l1_guest_code(struct svm_test_data *svm, u64 invalid_vmcb12_gpa)
+static void l1_vmrun(struct svm_test_data *svm, u64 gpa)
 {
-	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
-	generic_svm_setup(svm, l2_guest_code,
-			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+	asm volatile ("vmrun %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
 
-	valid_vmcb12_gpa = svm->vmcb_gpa;
+static void l1_vmload(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
-	run_guest(svm->vmcb, invalid_vmcb12_gpa); /* #GP */
+	asm volatile ("vmload %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
+
+static void l1_vmsave(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	asm volatile ("vmsave %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
+
+static void l1_vmexit(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
-	/* GP handler should jump here */
+	run_guest(svm->vmcb, svm->vmcb_gpa);
 	GUEST_ASSERT(svm->vmcb->control.exit_code == SVM_EXIT_VMMCALL);
 	GUEST_DONE();
 }
 
-int main(int argc, char *argv[])
+static u64 unmappable_gpa(struct kvm_vcpu *vcpu)
+{
+	struct userspace_mem_region *region;
+	u64 region_gpa_end, vm_gpa_end = 0;
+	int i;
+
+	hash_for_each(vcpu->vm->regions.slot_hash, i, region, slot_node) {
+		region_gpa_end = region->region.guest_phys_addr + region->region.memory_size;
+		vm_gpa_end = max(vm_gpa_end, region_gpa_end);
+	}
+
+	return vm_gpa_end;
+}
+
+static void test_invalid_vmcb12(struct kvm_vcpu *vcpu)
 {
-	struct kvm_x86_state *state;
 	vm_vaddr_t nested_gva = 0;
-	struct kvm_vcpu *vcpu;
-	uint32_t maxphyaddr;
-	u64 max_legal_gpa;
-	struct kvm_vm *vm;
 	struct ucall uc;
 
-	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
 
-	vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code);
 	vm_install_exception_handler(vcpu->vm, GP_VECTOR, guest_gp_handler);
-
-	/*
-	 * Find the max legal GPA that is not backed by a memslot (i.e. cannot
-	 * be mapped by KVM).
-	 */
-	maxphyaddr = kvm_cpuid_property(vcpu->cpuid, X86_PROPERTY_MAX_PHY_ADDR);
-	max_legal_gpa = BIT_ULL(maxphyaddr) - PAGE_SIZE;
-	vcpu_alloc_svm(vm, &nested_gva);
-	vcpu_args_set(vcpu, 2, nested_gva, max_legal_gpa);
-
-	/* VMRUN with max_legal_gpa, KVM injects a #GP */
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, -1ULL);
 	vcpu_run(vcpu);
+
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
 	TEST_ASSERT_EQ(get_ucall(vcpu, &uc), UCALL_SYNC);
 	TEST_ASSERT_EQ(uc.args[1], SYNC_GP);
+}
+
+static void test_unmappable_vmcb12(struct kvm_vcpu *vcpu)
+{
+	vm_vaddr_t nested_gva = 0;
+
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, unmappable_gpa(vcpu));
+	vcpu_run(vcpu);
+
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_INTERNAL_ERROR);
+	TEST_ASSERT_EQ(vcpu->run->emulation_failure.suberror, KVM_INTERNAL_ERROR_EMULATION);
+}
+
+static void test_unmappable_vmcb12_vmexit(struct kvm_vcpu *vcpu)
+{
+	struct kvm_x86_state *state;
+	vm_vaddr_t nested_gva = 0;
+	struct ucall uc;
 
 	/*
-	 * Enter L2 (with a legit vmcb12 GPA), then overwrite vmcb12 GPA with
-	 * max_legal_gpa. KVM will fail to map vmcb12 on nested VM-Exit and
+	 * Enter L2 (with a legit vmcb12 GPA), then overwrite vmcb12 GPA with an
+	 * unmappable GPA. KVM will fail to map vmcb12 on nested VM-Exit and
 	 * cause a shutdown.
 	 */
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, unmappable_gpa(vcpu));
 	vcpu_run(vcpu);
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
 	TEST_ASSERT_EQ(get_ucall(vcpu, &uc), UCALL_SYNC);
 	TEST_ASSERT_EQ(uc.args[1], SYNC_L2_STARTED);
 
 	state = vcpu_save_state(vcpu);
-	state->nested.hdr.svm.vmcb_pa = max_legal_gpa;
+	state->nested.hdr.svm.vmcb_pa = unmappable_gpa(vcpu);
 	vcpu_load_state(vcpu, state);
 	vcpu_run(vcpu);
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_SHUTDOWN);
 
 	kvm_x86_state_cleanup(state);
-	kvm_vm_free(vm);
-	return 0;
+}
+
+KVM_ONE_VCPU_TEST_SUITE(vmcb12_gpa);
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmrun_invalid, l1_vmrun)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmload_invalid, l1_vmload)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmsave_invalid, l1_vmsave)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmrun_unmappable, l1_vmrun)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmload_unmappable, l1_vmload)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmsave_unmappable, l1_vmsave)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+/*
+ * Invalid vmcb12_gpa cannot be tested for #VMEXIT as KVM_SET_NESTED_STATE will
+ * reject it.
+ */
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmexit_unmappable, l1_vmexit)
+{
+	test_unmappable_vmcb12_vmexit(vcpu);
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
+
+	return test_harness_run(argc, argv);
 }
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name
  2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
                   ` (5 preceding siblings ...)
  2026-03-13  0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
@ 2026-03-13  0:10 ` Yosry Ahmed
  6 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13  0:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

The test checks both invalid GPAs as well as unmappable GPAs, so drop
'invalid' from its name.

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 tools/testing/selftests/kvm/Makefile.kvm                        | 2 +-
 ...{svm_nested_invalid_vmcb12_gpa.c => svm_nested_vmcb12_gpa.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename tools/testing/selftests/kvm/x86/{svm_nested_invalid_vmcb12_gpa.c => svm_nested_vmcb12_gpa.c} (100%)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 3d372d78a2756..02fa13d4ad4f1 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -112,9 +112,9 @@ TEST_GEN_PROGS_x86 += x86/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86 += x86/svm_vmcall_test
 TEST_GEN_PROGS_x86 += x86/svm_int_ctl_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_clear_efer_svme
-TEST_GEN_PROGS_x86 += x86/svm_nested_invalid_vmcb12_gpa
 TEST_GEN_PROGS_x86 += x86/svm_nested_shutdown_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_soft_inject_test
+TEST_GEN_PROGS_x86 += x86/svm_nested_vmcb12_gpa
 TEST_GEN_PROGS_x86 += x86/svm_lbr_nested_state
 TEST_GEN_PROGS_x86 += x86/tsc_scaling_sync
 TEST_GEN_PROGS_x86 += x86/sync_regs_test
diff --git a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c b/tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c
similarity index 100%
rename from tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
rename to tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c
-- 
2.53.0.851.ga537e3e6e9-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
  2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
@ 2026-03-13 18:17   ` Yosry Ahmed
  2026-03-13 22:44     ` Sean Christopherson
  2026-03-16 15:25     ` Yosry Ahmed
  0 siblings, 2 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13 18:17 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

> @@ -2306,24 +2312,18 @@ static int gp_interception(struct kvm_vcpu *vcpu)
>                 goto reinject;
>
>         opcode = svm_instr_opcode(vcpu);
> +       if (opcode != NONE_SVM_INSTR)
> +               return emulate_svm_instr(vcpu, opcode);
>
> -       if (opcode == NONE_SVM_INSTR) {
> -               if (!enable_vmware_backdoor)
> -                       goto reinject;
> -
> -               /*
> -                * VMware backdoor emulation on #GP interception only handles
> -                * IN{S}, OUT{S}, and RDPMC.
> -                */
> -               if (!is_guest_mode(vcpu))
> -                       return kvm_emulate_instruction(vcpu,
> -                               EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> -       } else {
> -               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
> -                       goto reinject;
> +       if (!enable_vmware_backdoor)
> +               goto reinject;
>
> -               return emulate_svm_instr(vcpu, opcode);
> -       }
> +       /*
> +        * VMware backdoor emulation on #GP interception only handles
> +        * IN{S}, OUT{S}, and RDPMC.
> +        */
> +       if (!is_guest_mode(vcpu))
> +               return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);

AI review pointed out that we should not drop the page_address_valid()
from here, because if an SVM instruction is executed by L2, and KVM
intercepts the #GP, it should re-inject the #GP into L2 if RAX is
illegal instead of synthesizing a #VMEXIT to L1. My initial instinct
is to keep the check here as well as in the intercept handlers, but
no, L1's intercept should take precedence over #GP due to invalid RAX
anyway. In fact, if L1 has the intercept set, then it must be set in
vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.

The actual problem is that the current code does not check if L1
actually sets the intercept in emulate_svm_instr(). So if L1 and KVM
do not set the intercept, and RAX is invalid, the current code could
synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP. The
existing check on RAX prevents that, but it doesn't really fix the
problem because if we get #GP due to CPL != 0, we'll still generate a
spurious #VMEXIT to L1. What we really should be doing in
gp_interception() is:

1. if CPL != 0, re-inject #GP.
2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
3. Otherwise emulate the instruction, which would take care of
re-injecting the #GP if RAX is invalid with this patch.

Something like this on top (over 2 patches):

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cf5ebdc4b27bf..8942272eb80b2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
                [SVM_INSTR_VMLOAD] = vmload_interception,
                [SVM_INSTR_VMSAVE] = vmsave_interception,
        };
+       int exit_code = guest_mode_exit_codes[opcode];
        struct vcpu_svm *svm = to_svm(vcpu);

-       if (is_guest_mode(vcpu)) {
-               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
+       if (is_guest_mode(vcpu) &&
+           vmcb12_is_intercept(&svm->nested.ctl, exit_code)) {
+               nested_svm_simple_vmexit(svm, exit_code);
                return 1;
        }
        return svm_instr_handlers[opcode](vcpu);
@@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
                goto reinject;

        opcode = svm_instr_opcode(vcpu);
-       if (opcode != NONE_SVM_INSTR)
+       if (opcode != NONE_SVM_INSTR) {
+               if (svm->vmcb->save.cpl)
+                       goto reinject;
                return emulate_svm_instr(vcpu, opcode);
+       }

        if (!enable_vmware_backdoor)
                goto reinject;

---
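
That three-step ordering can also be written out as a tiny standalone sketch (the enum, helper name, and flags are illustrative only, not actual KVM code):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the proposed gp_interception() decision order. */
enum gp_action { GP_REINJECT, GP_SYNTH_VMEXIT, GP_EMULATE };

static enum gp_action gp_decide(int cpl, bool guest_mode, bool l1_intercepts)
{
	if (cpl != 0)
		return GP_REINJECT;		/* 1. CPL != 0: re-inject #GP */
	if (guest_mode && l1_intercepts)
		return GP_SYNTH_VMEXIT;		/* 2. L1 intercepts: synthesize #VMEXIT */
	return GP_EMULATE;			/* 3. emulate; invalid RAX is handled there */
}
```

Note gp_decide(0, true, false) falls through to emulation, which is where the RAX check ends up living.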

Sean, do you prefer that I send patches separately on top of this
series or a new version with these patches included?

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
  2026-03-13 18:17   ` Yosry Ahmed
@ 2026-03-13 22:44     ` Sean Christopherson
  2026-03-13 23:08       ` Yosry Ahmed
  2026-03-16 15:25     ` Yosry Ahmed
  1 sibling, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2026-03-13 22:44 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

On Fri, Mar 13, 2026, Yosry Ahmed wrote:
> > @@ -2306,24 +2312,18 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >                 goto reinject;
> >
> >         opcode = svm_instr_opcode(vcpu);
> > +       if (opcode != NONE_SVM_INSTR)
> > +               return emulate_svm_instr(vcpu, opcode);
> >
> > -       if (opcode == NONE_SVM_INSTR) {
> > -               if (!enable_vmware_backdoor)
> > -                       goto reinject;
> > -
> > -               /*
> > -                * VMware backdoor emulation on #GP interception only handles
> > -                * IN{S}, OUT{S}, and RDPMC.
> > -                */
> > -               if (!is_guest_mode(vcpu))
> > -                       return kvm_emulate_instruction(vcpu,
> > -                               EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> > -       } else {
> > -               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
> > -                       goto reinject;
> > +       if (!enable_vmware_backdoor)
> > +               goto reinject;
> >
> > -               return emulate_svm_instr(vcpu, opcode);
> > -       }
> > +       /*
> > +        * VMware backdoor emulation on #GP interception only handles
> > +        * IN{S}, OUT{S}, and RDPMC.
> > +        */
> > +       if (!is_guest_mode(vcpu))
> > +               return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> 
> AI review pointed out that we should not drop the page_address_valid()
> from here, because if an SVM instruction is executed by L2, and KVM
> intercepts the #GP, it should re-inject the #GP into L2 if RAX is
> illegal instead of synthesizing a #VMEXIT to L1. 

No, because the intercept has higher priority than the #GP due to bad RAX.

> My initial instinct is to keep the check here as well as in the intercept
> handlers, but no, L1's intercept should take precedence over #GP due to
> invalid RAX anyway. In fact, if L1 has the intercept set, then it must be set
> in vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.

Except for the erratum case.

> The actual problem is that the current code does not check if L1
> actually sets the intercept in emulate_svm_instr(). 

Oh dagnabbit.  I had thought about this, multiple times, but wrote it off as a
non-issue because if L1 wanted to intercept VMWHATEVER, KVM would set the intercept
in vmcb02 and would get _that_ instead of a #GP.  But the erratum case means that
hardware could have signaled #GP even when the instruction should have been
intercepted.

And I also forgot the KVM could be intercepting #GP for the VMware crud, which
would unintentionally grab the CPL case too.  Darn kitchen sink #GPs.

> So if L1 and KVM do not set the intercept, and RAX is invalid, the current
> code could synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP.
> The existing check on RAX prevents that, but it doesn't really fix the
> problem because if we get #GP due to CPL != 0, we'll still generate a
> spurious #VMEXIT to L1. What we really should be doing in gp_interception()
> is:
> 
> 1. if CPL != 0, re-inject #GP.
> 2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
> 3. Otherwise emulate the instruction, which would take care of
> re-injecting the #GP if RAX is invalid with this patch.
> 
> Something like this on top (over 2 patches):
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index cf5ebdc4b27bf..8942272eb80b2 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
>                 [SVM_INSTR_VMLOAD] = vmload_interception,
>                 [SVM_INSTR_VMSAVE] = vmsave_interception,
>         };
> +       int exit_code = guest_mode_exit_codes[opcode];
>         struct vcpu_svm *svm = to_svm(vcpu);
> 
> -       if (is_guest_mode(vcpu)) {
> -               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> +       if (is_guest_mode(vcpu) &&
> +           vmcb12_is_intercept(&svm->nested.ctl, exit_code)) {
> +               nested_svm_simple_vmexit(svm, exit_code);
>                 return 1;
>         }
>         return svm_instr_handlers[opcode](vcpu);
> @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
>                 goto reinject;
> 
>         opcode = svm_instr_opcode(vcpu);
> -       if (opcode != NONE_SVM_INSTR)
> +       if (opcode != NONE_SVM_INSTR) {
> +               if (svm->vmcb->save.cpl)
> +                       goto reinject;

Don't you need the page_address_valid() check here?  Ooooh, no, because either
emulate_svm_instr() will synthesize #VMEXIT, or svm_instr_handlers() will take
care of the #GP.  It's only CPL that needs to be checked early, because it has
priority over the #VMEXIT.

>                 return emulate_svm_instr(vcpu, opcode);
> +       }
> 
>         if (!enable_vmware_backdoor)
>                 goto reinject;
> 
> ---
> 
> Sean, do you prefer that I send patches separately on top of this
> series or a new version with these patches included?

Go ahead and send an entirely new series.  The less threads I have to chase down
after I get back, the less likely I am to screw things up :-)


* Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
  2026-03-13 22:44     ` Sean Christopherson
@ 2026-03-13 23:08       ` Yosry Ahmed
  0 siblings, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-13 23:08 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

> > > +       /*
> > > +        * VMware backdoor emulation on #GP interception only handles
> > > +        * IN{S}, OUT{S}, and RDPMC.
> > > +        */
> > > +       if (!is_guest_mode(vcpu))
> > > +               return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> > 
> > AI review pointed out that we should not drop the page_address_valid()
> > from here, because if an SVM instruction is executed by L2, and KVM
> > intercepts the #GP, it should re-inject the #GP into L2 if RAX is
> > illegal instead of synthesizing a #VMEXIT to L1. 
> 
> No, because the intercept has higher priority than the #GP due to bad RAX.

Which is literally what I say next :P

> 
> > My initial instincth is to keep the check here as well as in the intercept
> > handlers, but no, L1's intercept should take precedence over #GP due to
> > invalid RAX anyway. In fact, if L1 has the intercept set, then it must be set
> > in vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.
> 
> Except for the erratum case.

Yes.

> 
> > The actual problem is that the current code does not check if L1
> > actually sets the intercept in emulate_svm_instr(). 
> 
> Oh dagnabbit.  I had thought about this, multiple times, but wrote it off as a
> non-issue because if L1 wanted to intercept VMWHATEVER, KVM would set the intercept
> in vmcb02 and would get _that_ instead of a #GP. But the erratum case means that
> hardware could have signaled #GP even when the instruction should have been
> intercepted.

The problem is actually the other way around: it's when L1 does not want
to intercept it. So I think it's a problem regardless of the erratum.

> And I also forgot the KVM could be intercepting #GP for the VMware crud, which
> would unintentionally grab the CPL case too.  Darn kitchen sink #GPs.
> 
> > So if L1 and KVM do not set the intercept, and RAX is invalid, the current
> > code could synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP.
> > The existing check on RAX prevents that, but it doesn't really fix the
> > problem because if we get #GP due to CPL != 0, we'll still generate a
> > spurious #VMEXIT to L1. What we really should be doing in gp_interception()
> > is:
> > 
> > 1. if CPL != 0, re-inject #GP.
> > 2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
> > 3. Otherwise emulate the instruction, which would take care of
> > re-injecting the #GP if RAX is invalid with this patch.
> > 
> > Something like this on top (over 2 patches):
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index cf5ebdc4b27bf..8942272eb80b2 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
> >                 [SVM_INSTR_VMLOAD] = vmload_interception,
> >                 [SVM_INSTR_VMSAVE] = vmsave_interception,
> >         };
> > +       int exit_code = guest_mode_exit_codes[opcode];
> >         struct vcpu_svm *svm = to_svm(vcpu);
> > 
> > -       if (is_guest_mode(vcpu)) {
> > -               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> > +       if (is_guest_mode(vcpu) &&
> > +           vmcb12_is_intercept(&svm->nested.ctl, exit_code)) {
> > +               nested_svm_simple_vmexit(svm, exit_code);
> >                 return 1;
> >         }
> >         return svm_instr_handlers[opcode](vcpu);
> > @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >                 goto reinject;
> > 
> >         opcode = svm_instr_opcode(vcpu);
> > -       if (opcode != NONE_SVM_INSTR)
> > +       if (opcode != NONE_SVM_INSTR) {
> > +               if (svm->vmcb->save.cpl)
> > +                       goto reinject;
> 
> Don't you need the page_address_valid() check here?  Ooooh, no, because either
> emulate_svm_instr() will synthesize #VMEXIT, or svm_instr_handlers() will take
> care of the #GP.  It's only CPL that needs to be checked early, because it has
> priority over the #VMEXIT.

Yeah, exactly my thought process.

> 
> >                 return emulate_svm_instr(vcpu, opcode);
> > +       }
> > 
> >         if (!enable_vmware_backdoor)
> >                 goto reinject;
> > 
> > ---
> > 
> > Sean, do you prefer that I send patches separately on top of this
> > series or a new version with these patches included?
> 
> Go ahead and send an entirely new series.  The less threads I have to chase down
> after I get back, the less likely I am to screw things up :-)

I will send one next week.

I might also add a patch at the end cleaning up all of this
svm_instr_opcode() and emulate_svm_instr() stuff. The code is
unnecessarily convoluted: we get the opcode in one place, then key off of
it in another.

I think it would be nicer with a single helper to handle SVM
instructions, and that would create a good spot to add a comment about
precedence ordering. Something like this:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a0dacbeaa3c5a..d5afcb179398b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2235,54 +2235,42 @@ static int vmrun_interception(struct kvm_vcpu *vcpu)
 	return nested_svm_vmrun(vcpu);
 }
 
-enum {
-	NONE_SVM_INSTR,
-	SVM_INSTR_VMRUN,
-	SVM_INSTR_VMLOAD,
-	SVM_INSTR_VMSAVE,
-};
-
-/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
-static int svm_instr_opcode(struct kvm_vcpu *vcpu)
+static bool check_emulate_svm_instr(struct kvm_vcpu *vcpu, int *ret)
 {
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int exit_code;
 
 	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
-		return NONE_SVM_INSTR;
+		return false;
 
 	switch (ctxt->modrm) {
 	case 0xd8: /* VMRUN */
-		return SVM_INSTR_VMRUN;
+		exit_code = SVM_EXIT_VMRUN;
+		break;
 	case 0xda: /* VMLOAD */
-		return SVM_INSTR_VMLOAD;
+		exit_code = SVM_EXIT_VMLOAD;
+		break;
 	case 0xdb: /* VMSAVE */
-		return SVM_INSTR_VMSAVE;
-	default:
+		exit_code = SVM_EXIT_VMSAVE;
 		break;
+	default:
+		return false;
 	}
 
-	return NONE_SVM_INSTR;
-}
-
-static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
-{
-	const int guest_mode_exit_codes[] = {
-		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
-		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
-		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
-	};
-	int (*const svm_instr_handlers[])(struct kvm_vcpu *vcpu) = {
-		[SVM_INSTR_VMRUN] = vmrun_interception,
-		[SVM_INSTR_VMLOAD] = vmload_interception,
-		[SVM_INSTR_VMSAVE] = vmsave_interception,
-	};
-	struct vcpu_svm *svm = to_svm(vcpu);
+	/*
+	 * #GP due to CPL != 0 takes precedence over intercepts, but intercepts
+	 * take precedence over #GP due to invalid RAX (which is checked by the
+	 * exit handlers).
+	 */
+	*ret = 1;
+	if (to_svm(vcpu)->vmcb->save.cpl)
+		kvm_inject_gp(vcpu, 0);
+	else if (is_guest_mode(vcpu) && vmcb12_is_intercept(&svm->nested.ctl, exit_code))
+		nested_svm_simple_vmexit(svm, exit_code);
+	else
+		*ret = svm_invoke_exit_handler(vcpu, exit_code);
 
-	if (is_guest_mode(vcpu)) {
-		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
-		return 1;
-	}
-	return svm_instr_handlers[opcode](vcpu);
+	return true;
 }
 
 /*
@@ -2297,7 +2285,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 error_code = svm->vmcb->control.exit_info_1;
-	int opcode;
+	int r;
 
 	/* Both #GP cases have zero error_code */
 	if (error_code)
@@ -2307,9 +2295,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
 		goto reinject;
 
-	opcode = svm_instr_opcode(vcpu);
-	if (opcode != NONE_SVM_INSTR)
-		return emulate_svm_instr(vcpu, opcode);
+	if (check_emulate_svm_instr(vcpu, &r))
+		return r;
 
 	if (!enable_vmware_backdoor)
 		goto reinject;

---

The only thing I am unsure of is whether to check if it's an SVM
instruction in a separate helper to avoid the output parameter.


* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
@ 2026-03-15 12:55   ` Paolo Bonzini
  2026-03-16 13:49     ` Yosry Ahmed
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2026-03-15 12:55 UTC (permalink / raw)
  To: Yosry Ahmed, Sean Christopherson; +Cc: Jim Mattson, kvm, linux-kernel

On 3/13/26 01:10, Yosry Ahmed wrote:
> Outside of forced emulation or code stream rewriting,

But isn't that the point? Due to code stream rewriting or intentional 
usage of stale TLBs (so that the processor executes one instruction and 
the emulator another), the emulator cannot assume that it will "never be 
invoked with an actually illegal RAX".

I realize that I'm late to the show, so I apologize in advance if this 
has been discussed before.

Paolo



* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-15 12:55   ` Paolo Bonzini
@ 2026-03-16 13:49     ` Yosry Ahmed
  2026-03-16 16:28       ` Yosry Ahmed
  2026-03-17 13:15       ` Paolo Bonzini
  0 siblings, 2 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-16 13:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Jim Mattson, kvm, linux-kernel

On Sun, Mar 15, 2026 at 5:55 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 3/13/26 01:10, Yosry Ahmed wrote:
> > Outside of forced emulation or code stream rewriting,
>
> But isn't that the point? Due to code stream rewriting or intentional
> usage of stale TLBs (so that the processor executes one instruction and
> the emulator another), the emulator cannot assume that it will "never be
> invoked with an actually illegal RAX".
>
> I realize that I'm late to the show, so I apologize in advance if this
> has been discussed before.

Thanks for chiming in. FWIW, I initially intended to fix this check
instead of removing it, the removal came after a discussion with Sean,
see https://lore.kernel.org/kvm/abH0RdnM29Xyh_4G@google.com for more
context.

TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
broken/unsupported anyway, beyond checking for intercepts and
pre-intercept exceptions (well, even that is broken), and the RAX
check should be after that architecturally.

The emulator isn't wired up to actually emulate these instructions.
Also, it performs the #GP on CPL check before the #UD on EFER.SVME
check, which is wrong. The right thing to do for the illegal RAX check
would be to move it after the intercept check, but we don't have an
execute() callback to put it in. I think what we really should do is
actually wire up the emulator to call into the intercept handlers for
VMRUN/VMLOAD/VMSAVE, such that it actually supports these
instructions, and the RAX check is done in these intercept handlers (or
at least will be after this series).

Given that the goal of this series is actually to stop injecting a
non-architectural #GP when KVM fails to read vmcb12, I didn't really
want to derail it anymore by making it emulator-focused. We can
probably do a follow-up on just improving the emulator support for
these instructions as I mentioned above, as that would take its own
series AFAICT.

The only reason this patch exists here is because the buggy RAX check
in the emulator results in injecting a non-architectural #GP on 52-bit
MAXPHYADDR platforms (the #NPF scenario described in the changelog).
So the check is blocking the goal of the series. I didn't really care
much whether we fixed it or removed it, but Sean did not want to fix
it and have the emulator directly peek into vCPU state, so removal it
was.

I hope this makes things clearer.


* Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
  2026-03-13 18:17   ` Yosry Ahmed
  2026-03-13 22:44     ` Sean Christopherson
@ 2026-03-16 15:25     ` Yosry Ahmed
  1 sibling, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-16 15:25 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index cf5ebdc4b27bf..8942272eb80b2 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
>                 [SVM_INSTR_VMLOAD] = vmload_interception,
>                 [SVM_INSTR_VMSAVE] = vmsave_interception,
>         };
> +       int exit_code = guest_mode_exit_codes[opcode];
>         struct vcpu_svm *svm = to_svm(vcpu);
> 
> -       if (is_guest_mode(vcpu)) {
> -               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> +       if (is_guest_mode(vcpu) &&
> +           vmcb12_is_intercept(&svm->nested.ctl, exit_code)) {
> +               nested_svm_simple_vmexit(svm, exit_code);

No, this is wrong... well, it's incomplete. So we do need to check the
intercept in vmcb12, but, if it's not set, we'll end up with KVM
emulating the instructions through vmload_vmsave_interception(), and
treating RAX as an L1 GPA.

If L1 has VLS enabled though, this is wrong. KVM should treat RAX as an
L2 GPA an run it through the NPT first before using it (e.g. through
translate_nested_gpa()).

Synthesizing a spurious #VMEXIT(VMLOAD/VMSAVE) is definitely better than
letting L2 bypass L1's NPTs and access its memory. So this change is a
net loss. I will drop it from the next version, and this spurious
#VMEXIT can be fixed separately to keep this series focused on fixing
the non-architectural #GPs.
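
The L1-vs-L2 GPA distinction can be sketched with a toy model (the offset-based "NPT" and every name below are hypothetical, purely to show where the translation belongs; this is not KVM code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for walking L1's NPT: a trivial identity-plus-offset map. */
#define TOY_NPT_OFFSET 0x100000ULL

static uint64_t toy_translate_nested_gpa(uint64_t l2_gpa)
{
	return l2_gpa + TOY_NPT_OFFSET;
}

static uint64_t vmload_vmsave_gpa(uint64_t rax, bool guest_mode, bool vls)
{
	/*
	 * When L2 executes VMLOAD/VMSAVE with VLS exposed, RAX is an L2
	 * GPA and must be translated through L1's NPT before KVM touches
	 * memory; otherwise KVM is emulating on behalf of L1 and RAX is
	 * already an L1 GPA.
	 */
	if (guest_mode && vls)
		return toy_translate_nested_gpa(rax);
	return rax;
}
```

Skipping the translation in the guest-mode-with-VLS case is exactly the "treat RAX as an L1 GPA" bug: L2 would name L1 physical memory directly.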

>                 return 1;
>         }
>         return svm_instr_handlers[opcode](vcpu);
> @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
>                 goto reinject;
> 
>         opcode = svm_instr_opcode(vcpu);
> -       if (opcode != NONE_SVM_INSTR)
> +       if (opcode != NONE_SVM_INSTR) {
> +               if (svm->vmcb->save.cpl)
> +                       goto reinject;
>                 return emulate_svm_instr(vcpu, opcode);
> +       }
> 
>         if (!enable_vmware_backdoor)
>                 goto reinject;
> 
> ---
> 
> Sean, do you prefer that I send patches separately on top of this
> series or a new version with these patches included?


* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-16 13:49     ` Yosry Ahmed
@ 2026-03-16 16:28       ` Yosry Ahmed
  2026-03-17 13:15       ` Paolo Bonzini
  1 sibling, 0 replies; 18+ messages in thread
From: Yosry Ahmed @ 2026-03-16 16:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Jim Mattson, kvm, linux-kernel

On Mon, Mar 16, 2026 at 06:49:35AM -0700, Yosry Ahmed wrote:
> On Sun, Mar 15, 2026 at 5:55 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 3/13/26 01:10, Yosry Ahmed wrote:
> > > Outside of forced emulation or code stream rewriting,
> >
> > But isn't that the point? Due to code stream rewriting or intentional
> > usage of stale TLBs (so that the processor executes one instruction and
> > the emulator another), the emulator cannot assume that it will "never be
> > invoked with an actually illegal RAX".
> >
> > I realize that I'm late to the show, so I apologize in advance if this
> > has been discussed before.
> 
> Thanks for chiming in. FWIW, I initially intended to fix this check
> instead of removing it, the removal came after a discussion with Sean,
> see https://lore.kernel.org/kvm/abH0RdnM29Xyh_4G@google.com for more
> context.
> 
> TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> broken/unsupported anyway, beyond checking for intercepts and
> pre-intercept exceptions (well, even that is broken), and the RAX
> check should be after that architecturally.

No, this is wrong. I believe Sean's read of the APM was incomplete; he
quoted this part (which I did include in my changelog):

  Generally, instruction intercepts are checked after simple exceptions
  (such as #GP—when CPL is incorrect—or #UD) have been checked, but
  before exceptions related to memory accesses (such as page faults) and
  exceptions based on specific operand values.

But in table 15-7 for instruction intercepts, the rows for
VMRUN/VMLOAD/VMSAVE have this in the priority column:

  Checks exceptions (#GP) before the intercept.

Unlike other rows that specify #GP on CPL != 0.

Additionally, in the VMRUN pseudocode, we have this:

  IF ((MSR_EFER.SVME == 0) || (!PROTECTED_MODE))
    EXCEPTION [#UD] // This instruction can only be executed in
                       protected mode with SVM enabled
  IF (CPL != 0) // This instruction is only allowed at CPL 0
    EXCEPTION [#GP]
  IF (rAX contains an unsupported physical address)
    EXCEPTION [#GP]
  IF (intercepted(VMRUN))
    #VMEXIT (VMRUN)

The pseudocode for VMLOAD/VMSAVE does not have the intercepted() check
though, but I assume it's the same as VMRUN. I did confirm that with
vls=0, vmload_interception() is not being called on VMLOAD if RAX=-1ULL.

So I think the RAX check is actually intended to happen before the
intercept. I think I will go back to fixing the RAX check instead of
dropping it. 
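
Modeling the quoted pseudocode as a standalone sketch (the struct, field names, and the 48-bit width are illustrative assumptions, not KVM internals) makes the ordering explicit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of the VMRUN check ordering from the APM pseudocode. */
enum vmrun_outcome { OUT_UD, OUT_GP, OUT_VMEXIT, OUT_RUN };

struct vmrun_state {
	bool svme;			/* EFER.SVME */
	bool protected_mode;
	int cpl;
	uint64_t rax;			/* vmcb12 GPA */
	unsigned int maxphyaddr;	/* e.g. 48 */
	bool intercepted;
};

static enum vmrun_outcome vmrun_check(const struct vmrun_state *s)
{
	if (!s->svme || !s->protected_mode)
		return OUT_UD;		/* #UD checked first */
	if (s->cpl != 0)
		return OUT_GP;		/* then #GP on CPL != 0 */
	if (s->rax >> s->maxphyaddr)
		return OUT_GP;		/* then #GP on unsupported PA... */
	if (s->intercepted)
		return OUT_VMEXIT;	/* ...before the intercept check */
	return OUT_RUN;
}
```

An unsupported RAX yields OUT_GP even when intercepted is set, matching the reading that the RAX #GP has priority over the intercept.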


* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-16 13:49     ` Yosry Ahmed
  2026-03-16 16:28       ` Yosry Ahmed
@ 2026-03-17 13:15       ` Paolo Bonzini
  2026-03-17 14:58         ` Jim Mattson
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2026-03-17 13:15 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Sean Christopherson, Jim Mattson, kvm, Kernel Mailing List, Linux

Il lun 16 mar 2026, 14:49 Yosry Ahmed <yosry@kernel.org> ha scritto:
>
> TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> broken/unsupported anyway, beyond checking for intercepts and
> pre-intercept exceptions (well, even that is broken), and the RAX
> check should be after that architecturally.

Agreed about that, just ripping it out completely would be fine. It's
only if the emulation happens that it has to be complete.

> Given that the goal of this series is actually to stop injecting a
> non-architectural #GP when KVM fails to read vmcb12, I didn't really
> want to derail it anymore by making it emulator-focused. We can
> probably do a follow-up on just improving the emulator support for
> these instructions as I mentioned above, as that would take its own
> series AFAICT.

Agreed.

> The only reason this patch exists here is because the buggy RAX check
> in the emulator results in injecting a non-architectural #GP on 52-bit
> MAXPHYADDR platforms (the #NPF scenario described in the changelog).

Yeah, that part was clearly broken when physical address bits broke
AMD's 48-bit limit (and I don't think the details of what is
considered an invalid physical address are part of the architectural
description, for example some SMM ranges are carved out by the
VMRUN/VMLOAD/VMSAVE microcode).

Thanks,

Paolo

> So the check is blocking the goal of the series. I didn't really care
> much whether we fixed it or removed it, but Sean did not want to fix
> it and have the emulator directly peek into vCPU state, so removal it
> was.
>
> I hope this makes things clearer.
>



* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-17 13:15       ` Paolo Bonzini
@ 2026-03-17 14:58         ` Jim Mattson
  2026-03-18 15:55           ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: Jim Mattson @ 2026-03-17 14:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Yosry Ahmed, Sean Christopherson, kvm, Kernel Mailing List, Linux

On Tue, Mar 17, 2026 at 6:15 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Il lun 16 mar 2026, 14:49 Yosry Ahmed <yosry@kernel.org> ha scritto:
> >
> > TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> > broken/unsupported anyway, beyond checking for intercepts and
> > pre-intercept exceptions (well, even that is broken), and the RAX
> > check should be after that architecturally.
>
> Agreed about that, just ripping it out completely would be fine. It's
> only if the emulation happens that it has to be complete.
>
> > Given that the goal of this series is actually to stop injecting a
> > non-architectural #GP when KVM fails to read vmcb12, I didn't really
> > want to derail it anymore by making it emulator-focused. We can
> > probably do a follow-up on just improving the emulator support for
> > these instructions as I mentioned above, as that would take its own
> > series AFAICT.
>
> Agreed.
>
> > The only reason this patch exists here is because the buggy RAX check
> > in the emulator results in injecting a non-architectural #GP on 52-bit
> > MAXPHYADDR platforms (the #NPF scenario described in the changelog).
>
> Yeah, that part was clearly broken when physical address bits broke
> AMD's 48-bit limit (and I don't think the details of what is
> considered an invalid physical address are part of the architectural
> description, for example some SMM ranges are carved out by the
> VMRUN/VMLOAD/VMSAVE microcode).

IIUC, VLS is broken because it applies the host's microarchitectural
physical address restrictions to untranslated *guest* physical
addresses. Is that right?


* Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
  2026-03-17 14:58         ` Jim Mattson
@ 2026-03-18 15:55           ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2026-03-18 15:55 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yosry Ahmed, Sean Christopherson, kvm, Kernel Mailing List, Linux

On Tue, Mar 17, 2026 at 3:58 PM Jim Mattson <jmattson@google.com> wrote:
> > Yeah, that part was clearly broken when physical address bits broke
> > AMD's 48-bit limit (and I don't think the details of what is
> > considered an invalid physical address are part of the architectural
> > description, for example some SMM ranges are carved out by the
> > VMRUN/VMLOAD/VMSAVE microcode).
>
> IIUC, VLS is broken because it applies the host's microarchitectural
> physical address restrictions to untranslated *guest* physical
> addresses. Is that right?

Yes, look at gp_interception() in svm.c. I'm not sure if AMD ever said
publicly whether it was an erratum, or intentional even with VLS
enabled (https://lkml.org/lkml/2021/1/13/513).

(Also I hope that, with VLS enabled, the check is performed *again*
after translating the address).

Paolo



end of thread, other threads:[~2026-03-18 15:55 UTC | newest]

Thread overview: 18+ messages
2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
2026-03-15 12:55   ` Paolo Bonzini
2026-03-16 13:49     ` Yosry Ahmed
2026-03-16 16:28       ` Yosry Ahmed
2026-03-17 13:15       ` Paolo Bonzini
2026-03-17 14:58         ` Jim Mattson
2026-03-18 15:55           ` Paolo Bonzini
2026-03-13  0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
2026-03-13 18:17   ` Yosry Ahmed
2026-03-13 22:44     ` Sean Christopherson
2026-03-13 23:08       ` Yosry Ahmed
2026-03-16 15:25     ` Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed
