public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling
@ 2026-03-06 21:08 Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE Yosry Ahmed
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

Jim pointed out that VMRUN/VMLOAD/VMSAVE injecting a #GP when the vmcb12
GPA is valid but not mappable is not architectural [1]. The series
handles them as emulation failures and (mostly) exits to userspace
instead.

With vls=1, a VMSAVE/VMLOAD with an unmappable GPA will cause a #NPF and
be emulated. The emulator currently hardcodes the GPA check to 48 valid
bits and injects a #GP otherwise. Fix this to only inject a #GP if the
GPA actually exceeds maxphyaddr, and otherwise fail the emulation as
well.

Rework svm_nested_invalid_vmcb12_gpa to fix the fact that it's currently
testing #GP on VMLOAD instead of VMRUN, and extend it to test all of
VMRUN, VMLOAD, and VMSAVE in both cases of GPA > maxphyaddr and GPA <
maxphyaddr but unmappable. Finally rename it to make its name a bit more
generic and representative.

This is not strictly a v2, but it supersedes the series at [2].

[1]https://lore.kernel.org/kvm/CALMp9eSMtzDJn7tGtbj=zLYpcU7Tc7XjcWBRZH7Aa5YihSmN7g@mail.gmail.com/
[2]https://lore.kernel.org/kvm/20260305203005.1021335-1-yosry@kernel.org/

Yosry Ahmed (6):
  KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  KVM: nSVM: Simplify error handling of
    nested_svm_copy_vmcb12_to_cache()
  KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation
  KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12
    fails
  KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
  KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's
    name

 arch/x86/kvm/emulate.c                        |   3 +-
 arch/x86/kvm/svm/nested.c                     |  20 +-
 arch/x86/kvm/svm/svm.c                        |   8 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   2 +-
 .../testing/selftests/kvm/include/kvm_util.h  |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |   6 +
 .../kvm/x86/svm_nested_invalid_vmcb12_gpa.c   |  98 ----------
 .../selftests/kvm/x86/svm_nested_vmcb12_gpa.c | 179 ++++++++++++++++++
 8 files changed, 200 insertions(+), 117 deletions(-)
 delete mode 100644 tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c


base-commit: 5128b972fb2801ad9aca54d990a75611ab5283a9
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
@ 2026-03-06 21:08 ` Yosry Ahmed
  2026-03-06 22:27   ` Jim Mattson
  2026-03-06 21:08 ` [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache() Yosry Ahmed
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

Architecturally, VMRUN/VMLOAD/VMSAVE should generate a #GP if the
physical address in RAX is not supported. check_svme_pa() hardcodes this
to checking that bits 63-48 are not set. This is incorrect on HW
supporting 52 bits of physical address space, so use maxphyaddr instead.

Note that the host's maxphyaddr is used, not the guest, because the
emulator path for VMLOAD/VMSAVE is generally used when virtual
VMLOAD/VMSAVE is enabled AND a #NPF is generated. If a #NPF is not
generated, the CPU will inject a #GP based on the host's maxphyaddr.  So
this keeps the behavior consistent.

If KVM wants to consistently inject a #GP based on the guest's
maxphyaddr, it would need to disabled virtual VMLOAD/VMSAVE and
intercept all VMLOAD/VMSAVE instructions to do the check.

Also, emulating a smaller maxphyaddr for the guest than the host
generally doesn't work well, so it's not worth handling this.

Fixes: 01de8b09e606 ("KVM: SVM: Add intercept checks for SVM instructions")
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/emulate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6145dac4a605a..9ea2584dda912 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3887,8 +3887,7 @@ static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
 {
 	u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
 
-	/* Valid physical address? */
-	if (rax & 0xffff000000000000ULL)
+	if (rax & rsvd_bits(kvm_host.maxphyaddr, 63))
 		return emulate_gp(ctxt, 0);
 
 	return check_svme(ctxt);
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache()
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE Yosry Ahmed
@ 2026-03-06 21:08 ` Yosry Ahmed
  2026-03-12 18:13   ` Sean Christopherson
  2026-03-06 21:08 ` [PATCH v2 3/6] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

nested_svm_vmrun() currently stores the return value of
nested_svm_copy_vmcb12_to_cache() in a local variable 'err', separate
from the generally used 'ret' variable. This is done to have a single
call to kvm_skip_emulated_instruction(), such that we can store the
return value of kvm_skip_emulated_instruction() in 'ret', and then
re-check the return value of nested_svm_copy_vmcb12_to_cache() in 'err'.

The code is unnecessarily confusing. Instead, call
kvm_skip_emulated_instruction() in the failure path of
nested_svm_copy_vmcb12_to_cache() if the return value is not -EFAULT,
and drop 'err'.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b191c6cab57db..6d4c053778b21 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1079,7 +1079,7 @@ static int nested_svm_copy_vmcb12_to_cache(struct kvm_vcpu *vcpu, u64 vmcb12_gpa
 int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	int ret, err;
+	int ret;
 	u64 vmcb12_gpa;
 	struct vmcb *vmcb01 = svm->vmcb01.ptr;
 
@@ -1104,19 +1104,20 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 		return -EINVAL;
 
 	vmcb12_gpa = svm->vmcb->save.rax;
-	err = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
-	if (err == -EFAULT) {
+	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
+
+	/*
+	 * Advance RIP if #GP or #UD are not injected, but otherwise
+	 * stop if copying and checking vmcb12 failed.
+	 */
+	if (ret == -EFAULT) {
 		kvm_inject_gp(vcpu, 0);
 		return 1;
+	} else if (ret) {
+		return kvm_skip_emulated_instruction(vcpu);
 	}
 
-	/*
-	 * Advance RIP if #GP or #UD are not injected, but otherwise stop if
-	 * copying and checking vmcb12 failed.
-	 */
 	ret = kvm_skip_emulated_instruction(vcpu);
-	if (err)
-		return ret;
 
 	/*
 	 * Since vmcb01 is not in use, we can use it to store some of the L1
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/6] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache() Yosry Ahmed
@ 2026-03-06 21:08 ` Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

Currently, a #GP is only injected if kvm_vcpu_map() fails with -EINVAL.
But it could also fail with -EFAULT if creating a host mapping failed.
Inject a #GP in all cases, no reason to treat failure modes differently.

Similar to commit 01ddcdc55e09 ("KVM: nSVM: Always inject a #GP if
mapping VMCB12 fails on nested VMRUN"), treat all failures equally.

Fixes: 8c5fbf1a7231 ("KVM/nSVM: Use the new mapping API for mapping guest memory")
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/svm.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3407deac90bd6..7ec0b0e8945fe 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2190,10 +2190,8 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 	if (nested_svm_check_permissions(vcpu))
 		return 1;
 
-	ret = kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map);
-	if (ret) {
-		if (ret == -EINVAL)
-			kvm_inject_gp(vcpu, 0);
+	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map)) {
+		kvm_inject_gp(vcpu, 0);
 		return 1;
 	}
 
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
                   ` (2 preceding siblings ...)
  2026-03-06 21:08 ` [PATCH v2 3/6] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
@ 2026-03-06 21:08 ` Yosry Ahmed
  2026-03-07  1:09   ` Yosry Ahmed
  2026-03-06 21:08 ` [PATCH v2 5/6] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
  2026-03-06 21:09 ` [PATCH v2 6/6] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed
  5 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

KVM currently injects a #GP if mapping vmcb12 fails when emulating
VMRUN/VMLOAD/VMSAVE. This is not architectural behavior, as #GP should
only be injected if the physical address is not supported or not aligned
(which hardware will do before the VMRUN intercept is checked).

Instead, handle it as an emulation failure, similar to how nVMX handles
failures to read/write guest memory in several emulation paths.

When virtual VMLOAD/VMSAVE is enabled, if vmcb12's GPA is not mapped in
the NPTs a VMEXIT(#NPF) will be generated, and KVM will install an MMIO
SPTE and emulate the instruction if there is no corresponding memslot.
x86_emulate_insn() will return EMULATION_FAILED as VMLOAD/VMSAVE are not
handled as part of the twobyte_insn cases.

Even though this will also result in an emulation failure, it will only
result in a straight return to userspace if
KVM_CAP_EXIT_ON_EMULATION_FAILURE is set. Otherwise, it would inject #UD
and only exit to userspace if not in guest mode. So the behavior is
slightly different if virtual VMLOAD/VMSAVE is enabled.

Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c | 13 ++++++-------
 arch/x86/kvm/svm/svm.c    |  6 ++----
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6d4c053778b21..089cdcfd60340 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1107,15 +1107,14 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
 
 	/*
-	 * Advance RIP if #GP or #UD are not injected, but otherwise
-	 * stop if copying and checking vmcb12 failed.
+	 * Advance RIP if instruction emulation completes, whether it's a
+	 * successful VMRUN or a failed one with #VMEXIT(INVALID), but not if
+	 * #GP/#UD is injected, or if reading vmcb12 fails.
 	 */
-	if (ret == -EFAULT) {
-		kvm_inject_gp(vcpu, 0);
-		return 1;
-	} else if (ret) {
+	if (ret == -EFAULT)
+		return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
+	else if (ret)
 		return kvm_skip_emulated_instruction(vcpu);
-	}
 
 	ret = kvm_skip_emulated_instruction(vcpu);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7ec0b0e8945fe..35433bd345eff 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2190,10 +2190,8 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
 	if (nested_svm_check_permissions(vcpu))
 		return 1;
 
-	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map)) {
-		kvm_inject_gp(vcpu, 0);
-		return 1;
-	}
+	if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map))
+		return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
 
 	vmcb12 = map.hva;
 
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 5/6] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
                   ` (3 preceding siblings ...)
  2026-03-06 21:08 ` [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
@ 2026-03-06 21:08 ` Yosry Ahmed
  2026-03-06 21:09 ` [PATCH v2 6/6] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed
  5 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

The test currently allegedly makes sure that VMRUN causes a #GP in
vmcb12 GPA is valid but unmappable. However, it calls run_guest() with
an the test vmcb12 GPA, and the #GP is produced from VMLOAD, not VMRUN.

Additionally, the underlying just changed to match architectural
behavior, and all of VMRUN/VMLOAD/VMSAVE fail emulation if vmcb12 cannot
be mapped. The CPU still injects a #GP if the vmcb12 GPA exceeds
maxphyaddr.

Rework the test such to use the KVM_ONE_VCPU_TEST[_SUITE] harness, and
test all of VMRUN/VMLOAD/VMSAVE with both an invalid GPA (-1ULL) causing
a #GP, and a valid but unmappable GPA causing emulation failure. Execute
the instructions directly from L1 instead of run_guest() to make sure
the #GP or emulation failure is produced by the right instruction.

Leave the #VMEXIT with unmappable GPA test case as-is, but wrap it with
a test harness as well.

Opportunisitically drop gp_triggered, as the test already checks that
a #GP was injected through a SYNC, and add an assertion that the max
legal GPA is in fact not mapped by userspace (i.e. KVM cannot map it).

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 .../testing/selftests/kvm/include/kvm_util.h  |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |   6 +
 .../kvm/x86/svm_nested_invalid_vmcb12_gpa.c   | 153 +++++++++++++-----
 3 files changed, 124 insertions(+), 36 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 8b39cb919f4fc..61fb2cb7df288 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -734,6 +734,7 @@ void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa);
 void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva);
 vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva);
 void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa);
+bool addr_gpa_has_hva(struct kvm_vm *vm, vm_paddr_t gpa);
 
 #ifndef vcpu_arch_put_guest
 #define vcpu_arch_put_guest(mem, val) do { (mem) = (val); } while (0)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1959bf556e88e..9cf68a558c08b 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1710,6 +1710,12 @@ void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa)
 	return (void *) ((uintptr_t) region->host_alias + offset);
 }
 
+bool addr_gpa_has_hva(struct kvm_vm *vm, vm_paddr_t gpa)
+{
+	gpa = vm_untag_gpa(vm, gpa);
+	return !!userspace_mem_region_find(vm, gpa, gpa);
+}
+
 /* Create an interrupt controller chip for the specified VM. */
 void vm_create_irqchip(struct kvm_vm *vm)
 {
diff --git a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c b/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
index c6d5f712120d1..3eaee8ad90211 100644
--- a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
+++ b/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
@@ -6,6 +6,8 @@
 #include "vmx.h"
 #include "svm_util.h"
 #include "kselftest.h"
+#include "kvm_test_harness.h"
+#include "test_util.h"
 
 
 #define L2_GUEST_STACK_SIZE 64
@@ -13,86 +15,165 @@
 #define SYNC_GP 101
 #define SYNC_L2_STARTED 102
 
-u64 valid_vmcb12_gpa;
-int gp_triggered;
+static unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
 
 static void guest_gp_handler(struct ex_regs *regs)
 {
-	GUEST_ASSERT(!gp_triggered);
 	GUEST_SYNC(SYNC_GP);
-	gp_triggered = 1;
-	regs->rax = valid_vmcb12_gpa;
 }
 
-static void l2_guest_code(void)
+static void l2_code(void)
 {
 	GUEST_SYNC(SYNC_L2_STARTED);
 	vmcall();
 }
 
-static void l1_guest_code(struct svm_test_data *svm, u64 invalid_vmcb12_gpa)
+static void l1_vmrun(struct svm_test_data *svm, u64 gpa)
 {
-	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
-	generic_svm_setup(svm, l2_guest_code,
-			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+	asm volatile ("vmrun %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
+
+static void l1_vmload(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	asm volatile ("vmload %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
 
-	valid_vmcb12_gpa = svm->vmcb_gpa;
+static void l1_vmsave(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
-	run_guest(svm->vmcb, invalid_vmcb12_gpa); /* #GP */
+	asm volatile ("vmsave %[gpa]" : : [gpa] "a" (gpa) : "memory");
+}
 
-	/* GP handler should jump here */
+static void l1_vmexit(struct svm_test_data *svm, u64 gpa)
+{
+	generic_svm_setup(svm, l2_code, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	run_guest(svm->vmcb, svm->vmcb_gpa);
 	GUEST_ASSERT(svm->vmcb->control.exit_code == SVM_EXIT_VMMCALL);
 	GUEST_DONE();
 }
 
-int main(int argc, char *argv[])
+/*
+ * Find the max legal GPA that is not backed by a memslot (i.e. cannot be mapped
+ * by KVM).
+ */
+static u64 unmappable_gpa(struct kvm_vcpu *vcpu)
 {
-	struct kvm_x86_state *state;
-	vm_vaddr_t nested_gva = 0;
-	struct kvm_vcpu *vcpu;
 	uint32_t maxphyaddr;
 	u64 max_legal_gpa;
-	struct kvm_vm *vm;
-	struct ucall uc;
-
-	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
-
-	vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code);
-	vm_install_exception_handler(vcpu->vm, GP_VECTOR, guest_gp_handler);
 
-	/*
-	 * Find the max legal GPA that is not backed by a memslot (i.e. cannot
-	 * be mapped by KVM).
-	 */
 	maxphyaddr = kvm_cpuid_property(vcpu->cpuid, X86_PROPERTY_MAX_PHY_ADDR);
 	max_legal_gpa = BIT_ULL(maxphyaddr) - PAGE_SIZE;
-	vcpu_alloc_svm(vm, &nested_gva);
-	vcpu_args_set(vcpu, 2, nested_gva, max_legal_gpa);
+	TEST_ASSERT(!addr_gpa_has_hva(vcpu->vm, max_legal_gpa),
+		    "Expected max_legal_gpa to not be mapped by userspace");
+
+	return max_legal_gpa;
+}
+
+static void test_invalid_vmcb12(struct kvm_vcpu *vcpu)
+{
+	vm_vaddr_t nested_gva = 0;
+	struct ucall uc;
+
 
-	/* VMRUN with max_legal_gpa, KVM injects a #GP */
+	vm_install_exception_handler(vcpu->vm, GP_VECTOR, guest_gp_handler);
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, -1ULL);
 	vcpu_run(vcpu);
+
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
 	TEST_ASSERT_EQ(get_ucall(vcpu, &uc), UCALL_SYNC);
 	TEST_ASSERT_EQ(uc.args[1], SYNC_GP);
+}
+
+static void test_unmappable_vmcb12(struct kvm_vcpu *vcpu)
+{
+	vm_vaddr_t nested_gva = 0;
+
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, unmappable_gpa(vcpu));
+	vcpu_run(vcpu);
+
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_INTERNAL_ERROR);
+	TEST_ASSERT_EQ(vcpu->run->emulation_failure.suberror, KVM_INTERNAL_ERROR_EMULATION);
+}
+
+static void test_unmappable_vmcb12_vmexit(struct kvm_vcpu *vcpu)
+{
+	struct kvm_x86_state *state;
+	vm_vaddr_t nested_gva = 0;
+	struct ucall uc;
 
 	/*
-	 * Enter L2 (with a legit vmcb12 GPA), then overwrite vmcb12 GPA with
-	 * max_legal_gpa. KVM will fail to map vmcb12 on nested VM-Exit and
+	 * Enter L2 (with a legit vmcb12 GPA), then overwrite vmcb12 GPA with an
+	 * unmappable GPA. KVM will fail to map vmcb12 on nested VM-Exit and
 	 * cause a shutdown.
 	 */
+	vcpu_alloc_svm(vcpu->vm, &nested_gva);
+	vcpu_args_set(vcpu, 2, nested_gva, unmappable_gpa(vcpu));
 	vcpu_run(vcpu);
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
 	TEST_ASSERT_EQ(get_ucall(vcpu, &uc), UCALL_SYNC);
 	TEST_ASSERT_EQ(uc.args[1], SYNC_L2_STARTED);
 
 	state = vcpu_save_state(vcpu);
-	state->nested.hdr.svm.vmcb_pa = max_legal_gpa;
+	state->nested.hdr.svm.vmcb_pa = unmappable_gpa(vcpu);
 	vcpu_load_state(vcpu, state);
 	vcpu_run(vcpu);
 	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_SHUTDOWN);
 
 	kvm_x86_state_cleanup(state);
-	kvm_vm_free(vm);
-	return 0;
+}
+
+KVM_ONE_VCPU_TEST_SUITE(vmcb12_gpa);
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmrun_invalid, l1_vmrun)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmload_invalid, l1_vmload)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmsave_invalid, l1_vmsave)
+{
+	test_invalid_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmrun_unmappable, l1_vmrun)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmload_unmappable, l1_vmload)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmsave_unmappable, l1_vmsave)
+{
+	test_unmappable_vmcb12(vcpu);
+}
+
+/*
+ * Invalid vmcb12_gpa cannot be test for #VMEXIT as KVM_SET_NESTED_STATE will
+ * reject it.
+ */
+KVM_ONE_VCPU_TEST(vmcb12_gpa, vmexit_unmappable, l1_vmexit)
+{
+	test_unmappable_vmcb12_vmexit(vcpu);
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
+
+	return test_harness_run(argc, argv);
 }
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 6/6] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name
  2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
                   ` (4 preceding siblings ...)
  2026-03-06 21:08 ` [PATCH v2 5/6] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
@ 2026-03-06 21:09 ` Yosry Ahmed
  5 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 21:09 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel, Yosry Ahmed

The test checks both invalid GPAs as well as unmappable GPAs, so drop
'invalid' from its name.

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 tools/testing/selftests/kvm/Makefile.kvm                        | 2 +-
 ...{svm_nested_invalid_vmcb12_gpa.c => svm_nested_vmcb12_gpa.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename tools/testing/selftests/kvm/x86/{svm_nested_invalid_vmcb12_gpa.c => svm_nested_vmcb12_gpa.c} (100%)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index ba87cd31872bd..83792d136ac3b 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -111,9 +111,9 @@ TEST_GEN_PROGS_x86 += x86/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86 += x86/svm_vmcall_test
 TEST_GEN_PROGS_x86 += x86/svm_int_ctl_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_clear_efer_svme
-TEST_GEN_PROGS_x86 += x86/svm_nested_invalid_vmcb12_gpa
 TEST_GEN_PROGS_x86 += x86/svm_nested_shutdown_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_soft_inject_test
+TEST_GEN_PROGS_x86 += x86/svm_nested_vmcb12_gpa
 TEST_GEN_PROGS_x86 += x86/svm_lbr_nested_state
 TEST_GEN_PROGS_x86 += x86/tsc_scaling_sync
 TEST_GEN_PROGS_x86 += x86/sync_regs_test
diff --git a/tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c b/tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c
similarity index 100%
rename from tools/testing/selftests/kvm/x86/svm_nested_invalid_vmcb12_gpa.c
rename to tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 21:08 ` [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE Yosry Ahmed
@ 2026-03-06 22:27   ` Jim Mattson
  2026-03-06 22:37     ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Jim Mattson @ 2026-03-06 22:27 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> Architecturally, VMRUN/VMLOAD/VMSAVE should generate a #GP if the
> physical address in RAX is not supported. check_svme_pa() hardcodes this
> to checking that bits 63-48 are not set. This is incorrect on HW
> supporting 52 bits of physical address space, so use maxphyaddr instead.
>
> Note that the host's maxphyaddr is used, not the guest, because the
> emulator path for VMLOAD/VMSAVE is generally used when virtual
> VMLOAD/VMSAVE is enabled AND a #NPF is generated. If a #NPF is not
> generated, the CPU will inject a #GP based on the host's maxphyaddr.  So
> this keeps the behavior consistent.
>
> If KVM wants to consistently inject a #GP based on the guest's
> maxphyaddr, it would need to disabled virtual VMLOAD/VMSAVE and
> intercept all VMLOAD/VMSAVE instructions to do the check.
>
> Also, emulating a smaller maxphyaddr for the guest than the host
> generally doesn't work well, so it's not worth handling this.

If we're going to throw in the towel on allow_smaller_maxphyaddr, the
code should be removed.

In any case, the check should logically be against the guest's
maxphyaddr, because the VMLOAD/VMSAVE instruction executes in guest
context.

Note that virtual VMLOAD/VMSAVE cannot be used if the guest's
maxphyaddr doesn't match the host's maxphyaddr.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 22:27   ` Jim Mattson
@ 2026-03-06 22:37     ` Yosry Ahmed
  2026-03-06 23:12       ` Jim Mattson
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 22:37 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 6, 2026 at 2:27 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
> >
> > Architecturally, VMRUN/VMLOAD/VMSAVE should generate a #GP if the
> > physical address in RAX is not supported. check_svme_pa() hardcodes this
> > to checking that bits 63-48 are not set. This is incorrect on HW
> > supporting 52 bits of physical address space, so use maxphyaddr instead.
> >
> > Note that the host's maxphyaddr is used, not the guest, because the
> > emulator path for VMLOAD/VMSAVE is generally used when virtual
> > VMLOAD/VMSAVE is enabled AND a #NPF is generated. If a #NPF is not
> > generated, the CPU will inject a #GP based on the host's maxphyaddr.  So
> > this keeps the behavior consistent.
> >
> > If KVM wants to consistently inject a #GP based on the guest's
> > maxphyaddr, it would need to disabled virtual VMLOAD/VMSAVE and
> > intercept all VMLOAD/VMSAVE instructions to do the check.
> >
> > Also, emulating a smaller maxphyaddr for the guest than the host
> > generally doesn't work well, so it's not worth handling this.
>
> If we're going to throw in the towel on allow_smaller_maxphyaddr, the
> code should be removed.
>
> In any case, the check should logically be against the guest's
> maxphyaddr, because the VMLOAD/VMSAVE instruction executes in guest
> context.

Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
consistently with vls=1, whether it's done by the hardware or the
emulator.

>
> Note that virtual VMLOAD/VMSAVE cannot be used if the guest's
> maxphyaddr doesn't match the host's maxphyaddr.

Not sure what you mean? Do you mean it wouldn't be correct to use it?
AFAICT that doesn't prevent it from being enabled.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 22:37     ` Yosry Ahmed
@ 2026-03-06 23:12       ` Jim Mattson
  2026-03-06 23:20         ` Yosry Ahmed
  2026-03-07  0:28         ` Sean Christopherson
  0 siblings, 2 replies; 30+ messages in thread
From: Jim Mattson @ 2026-03-06 23:12 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 6, 2026 at 2:37 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> On Fri, Mar 6, 2026 at 2:27 PM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
> > >
> > > Architecturally, VMRUN/VMLOAD/VMSAVE should generate a #GP if the
> > > physical address in RAX is not supported. check_svme_pa() hardcodes this
> > > to checking that bits 63-48 are not set. This is incorrect on HW
> > > supporting 52 bits of physical address space, so use maxphyaddr instead.
> > >
> > > Note that the host's maxphyaddr is used, not the guest, because the
> > > emulator path for VMLOAD/VMSAVE is generally used when virtual
> > > VMLOAD/VMSAVE is enabled AND a #NPF is generated. If a #NPF is not
> > > generated, the CPU will inject a #GP based on the host's maxphyaddr.  So
> > > this keeps the behavior consistent.
> > >
> > > If KVM wants to consistently inject a #GP based on the guest's
> > > maxphyaddr, it would need to disabled virtual VMLOAD/VMSAVE and
> > > intercept all VMLOAD/VMSAVE instructions to do the check.
> > >
> > > Also, emulating a smaller maxphyaddr for the guest than the host
> > > generally doesn't work well, so it's not worth handling this.
> >
> > If we're going to throw in the towel on allow_smaller_maxphyaddr, the
> > code should be removed.
> >
> > In any case, the check should logically be against the guest's
> > maxphyaddr, because the VMLOAD/VMSAVE instruction executes in guest
> > context.
>
> Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> consistently with vls=1, whether it's done by the hardware or the
> emulator.

Consistency should not be an issue, since VLS cannot be enabled when
the MAXPHYADDRs differ. VLS doesn't work in that scenario.

> >
> > Note that virtual VMLOAD/VMSAVE cannot be used if the guest's
> > maxphyaddr doesn't match the host's maxphyaddr.
>
> Not sure what you mean? Do you mean it wouldn't be correct to use it?
> AFAICT that doesn't prevent it from being enabled.

It is incorrect to use VLS when it doesn't work.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 23:12       ` Jim Mattson
@ 2026-03-06 23:20         ` Yosry Ahmed
  2026-03-06 23:45           ` Jim Mattson
  2026-03-07  0:32           ` Sean Christopherson
  2026-03-07  0:28         ` Sean Christopherson
  1 sibling, 2 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-06 23:20 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

> > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > consistently with vls=1, whether it's done by the hardware or the
> > emulator.
>
> Consistency should not be an issue, since VLS cannot be enabled when
> the MAXPHYADDRs differ. VLS doesn't work in that scenario.

Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
exceeds the guest's MAXPHYADDR, but not the host's, right? So only
broken if the guest is misbehaving.

Taking a step back, I am not disagreeing that VLS should not be used
with different MAXPHYADDRs, I am just saying it might be.

All that being said, I am fine with using cpuid_maxphyaddr(vcpu)
instead of kvm_host.maxphyaddr. Will wait for Sean's feedback to
figure out if a new version is needed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 23:20         ` Yosry Ahmed
@ 2026-03-06 23:45           ` Jim Mattson
  2026-03-07  0:32           ` Sean Christopherson
  1 sibling, 0 replies; 30+ messages in thread
From: Jim Mattson @ 2026-03-06 23:45 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 6, 2026 at 3:20 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> > > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > > consistently with vls=1, whether it's done by the hardware or the
> > > emulator.
> >
> > Consistency should not be an issue, since VLS cannot be enabled when
> > the MAXPHYADDRs differ. VLS doesn't work in that scenario.
>
> Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
> exceeds the guest's MAXPHYADDR, but not the host's, right? So only
> broken if the guest is misbehaving.

"Misbehaving" is a tad pejorative. Faulting behavior is part of the
architectural specification. A less biased assessment is that VLS is
partially correct when the MAXPHYADDRs don't match.

People thought it was a big deal when FDIV produced incorrect results
one out of 10 billion times.

> Taking a step back, I am not disagreeing that VLS should not be used
> with different MAXPHYADDRs, I am just saying it might be.

That would be wrong.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 23:12       ` Jim Mattson
  2026-03-06 23:20         ` Yosry Ahmed
@ 2026-03-07  0:28         ` Sean Christopherson
  2026-03-07  0:31           ` Yosry Ahmed
  1 sibling, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2026-03-07  0:28 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Yosry Ahmed, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 06, 2026, Jim Mattson wrote:
> On Fri, Mar 6, 2026 at 2:37 PM Yosry Ahmed <yosry@kernel.org> wrote:
> >
> > On Fri, Mar 6, 2026 at 2:27 PM Jim Mattson <jmattson@google.com> wrote:
> > >
> > > On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
> > > >
> > > > Architecturally, VMRUN/VMLOAD/VMSAVE should generate a #GP if the
> > > > physical address in RAX is not supported. check_svme_pa() hardcodes this
> > > > to checking that bits 63-48 are not set. This is incorrect on HW
> > > > supporting 52 bits of physical address space, so use maxphyaddr instead.
> > > >
> > > > Note that the host's maxphyaddr is used, not the guest, because the
> > > > emulator path for VMLOAD/VMSAVE is generally used when virtual
> > > > VMLOAD/VMSAVE is enabled AND a #NPF is generated. If a #NPF is not
> > > > generated, the CPU will inject a #GP based on the host's maxphyaddr.  So
> > > > this keeps the behavior consistent.
> > > >
> > > > If KVM wants to consistently inject a #GP based on the guest's
> > > > maxphyaddr, it would need to disabled virtual VMLOAD/VMSAVE and
> > > > intercept all VMLOAD/VMSAVE instructions to do the check.
> > > >
> > > > Also, emulating a smaller maxphyaddr for the guest than the host
> > > > generally doesn't work well, so it's not worth handling this.
> > >
> > > If we're going to throw in the towel on allow_smaller_maxphyaddr, the
> > > code should be removed.
> > >
> > > In any case, the check should logically be against the guest's
> > > maxphyaddr, because the VMLOAD/VMSAVE instruction executes in guest
> > > context.
> >
> > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > consistently with vls=1, whether it's done by the hardware or the
> > emulator.
> 
> Consistency should not be an issue, since VLS cannot be enabled when
> the MAXPHYADDRs differ. VLS doesn't work in that scenario.
> 
> > >
> > > Note that virtual VMLOAD/VMSAVE cannot be used if the guest's
> > > maxphyaddr doesn't match the host's maxphyaddr.
> >
> > Not sure what you mean? Do you mean it wouldn't be correct to use it?
> > AFAICT that doesn't prevent it from being enabled.

It does, actually.  KVM doesn't support allow_smaller_maxphyaddr when NPT is
enabled, because AMD CPUs (and now some Intel CPUs, lolz) do A/D updates before
signalling the reserved #NPF.

	allow_smaller_maxphyaddr = !npt_enabled;


And vls is disabled if NPT is disabled, for all the reasons Jim is pointing out.

	if (vls) {
		if (!npt_enabled ||
		    !boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
		    !IS_ENABLED(CONFIG_X86_64)) {
			vls = false;
		} else {
			pr_info("Virtual VMLOAD VMSAVE supported\n");
		}
	}

Thus running with allow_smaller_maxphyaddr+vls is impossible.

> It is incorrect to use VLS when it doesn't work.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-07  0:28         ` Sean Christopherson
@ 2026-03-07  0:31           ` Yosry Ahmed
  0 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-07  0:31 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> > > >
> > > > Note that virtual VMLOAD/VMSAVE cannot be used if the guest's
> > > > maxphyaddr doesn't match the host's maxphyaddr.
> > >
> > > Not sure what you mean? Do you mean it wouldn't be correct to use it?
> > > AFAICT that doesn't prevent it from being enabled.
>
> It does, actually.  KVM doesn't support allow_smaller_maxphyaddr when NPT is
> enabled, because AMD CPUs (and now some Intel CPUs, lolz) do A/D updates before
> signalling the reserved #NPF.
>
>         allow_smaller_maxphyaddr = !npt_enabled;
>
>
> And vls is disabled if NPT is disabled, for all the reasons Jim is pointing out.
>
>         if (vls) {
>                 if (!npt_enabled ||
>                     !boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
>                     !IS_ENABLED(CONFIG_X86_64)) {
>                         vls = false;
>                 } else {
>                         pr_info("Virtual VMLOAD VMSAVE supported\n");
>                 }
>         }
>
> Thus running with allow_smaller_maxphyaddr+vls is impossible.

Oh, well never mind then.. :)

Will switch to cpuid_maxphyaddr(vcpu) in the next version (assuming
there's one).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-06 23:20         ` Yosry Ahmed
  2026-03-06 23:45           ` Jim Mattson
@ 2026-03-07  0:32           ` Sean Christopherson
  2026-03-11 18:31             ` Yosry Ahmed
  1 sibling, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2026-03-07  0:32 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> > > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > > consistently with vls=1, whether it's done by the hardware or the
> > > emulator.
> >
> > Consistency should not be an issue, since VLS cannot be enabled when
> > the MAXPHYADDRs differ. VLS doesn't work in that scenario.
> 
> Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
> exceeds the guest's MAXPHYADDR, but not the host's, right? So only
> broken if the guest is misbehaving.
> 
> Taking a step back, I am not disagreeing that VLS should not be used
> with different MAXPHYADDRs, I am just saying it might be.

KVM straight up doesn't support that (see my other reply).

> All that being said, I am fine with using cpuid_maxphyaddr(vcpu)
> instead of kvm_host.maxphyaddr. Will wait for Sean's feedback to
> figure out if a new version is needed.

LOL, Jim and I are of one mind when it comes to guest.MAXPHYADDR.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
  2026-03-06 21:08 ` [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
@ 2026-03-07  1:09   ` Yosry Ahmed
  2026-03-09 13:56     ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-07  1:09 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> KVM currently injects a #GP if mapping vmcb12 fails when emulating
> VMRUN/VMLOAD/VMSAVE. This is not architectural behavior, as #GP should
> only be injected if the physical address is not supported or not aligned
> (which hardware will do before the VMRUN intercept is checked).
>
> Instead, handle it as an emulation failure, similar to how nVMX handles
> failures to read/write guest memory in several emulation paths.
>
> When virtual VMLOAD/VMSAVE is enabled, if vmcb12's GPA is not mapped in
> the NPTs a VMEXIT(#NPF) will be generated, and KVM will install an MMIO
> SPTE and emulate the instruction if there is no corresponding memslot.
> x86_emulate_insn() will return EMULATION_FAILED as VMLOAD/VMSAVE are not
> handled as part of the twobyte_insn cases.
>
> Even though this will also result in an emulation failure, it will only
> result in a straight return to userspace if
> KVM_CAP_EXIT_ON_EMULATION_FAILURE is set. Otherwise, it would inject #UD
> and only exit to userspace if not in guest mode. So the behavior is
> slightly different if virtual VMLOAD/VMSAVE is enabled.
>
> Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> Reported-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> ---

Nice find from AI bot, we should probably update gp_interception() to
make sure we reinject a #GP if the address exceeds MAXPHYADDR.
Something like:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5362443f4bbce..1c52d6d59c480 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2320,7 +2320,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
                                EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
        } else {
                /* All SVM instructions expect page aligned RAX */
-               if (svm->vmcb->save.rax & ~PAGE_MASK)
+               if (svm->vmcb->save.rax & ~PAGE_MASK ||
+                   svm->vmcb->save.rax & rsvd_bits(cpuid_maxphyaddr(vcpu), 63))
                        goto reinject;

                return emulate_svm_instr(vcpu, opcode);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
  2026-03-07  1:09   ` Yosry Ahmed
@ 2026-03-09 13:56     ` Yosry Ahmed
  0 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-09 13:56 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

On Fri, Mar 6, 2026 at 5:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
> >
> > KVM currently injects a #GP if mapping vmcb12 fails when emulating
> > VMRUN/VMLOAD/VMSAVE. This is not architectural behavior, as #GP should
> > only be injected if the physical address is not supported or not aligned
> > (which hardware will do before the VMRUN intercept is checked).
> >
> > Instead, handle it as an emulation failure, similar to how nVMX handles
> > failures to read/write guest memory in several emulation paths.
> >
> > When virtual VMLOAD/VMSAVE is enabled, if vmcb12's GPA is not mapped in
> > the NPTs a VMEXIT(#NPF) will be generated, and KVM will install an MMIO
> > SPTE and emulate the instruction if there is no corresponding memslot.
> > x86_emulate_insn() will return EMULATION_FAILED as VMLOAD/VMSAVE are not
> > handled as part of the twobyte_insn cases.
> >
> > Even though this will also result in an emulation failure, it will only
> > result in a straight return to userspace if
> > KVM_CAP_EXIT_ON_EMULATION_FAILURE is set. Otherwise, it would inject #UD
> > and only exit to userspace if not in guest mode. So the behavior is
> > slightly different if virtual VMLOAD/VMSAVE is enabled.
> >
> > Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > Reported-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> > ---
>
> Nice find from AI bot, we should probably update gp_interception() to
> make sure we reinject a #GP if the address exceeds MAXPHYADDR.
> Something like:

Actually we should probably add a helper (e.g. svm_instr_should_gp()
or svm_instr_check_rax()) to figure out if we need to #GP on RAX for
VMRUN/VMLOAD/VMSAVE, and use it in both gp_interception() and
check_svme_pa() -- the latter doesn't have the misaligned page check
(although I think in practice we might not need it).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-07  0:32           ` Sean Christopherson
@ 2026-03-11 18:31             ` Yosry Ahmed
  2026-03-11 20:07               ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-11 18:31 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Fri, Mar 6, 2026 at 4:32 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> > > > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > > > consistently with vls=1, whether it's done by the hardware or the
> > > > emulator.
> > >
> > > Consistency should not be an issue, since VLS cannot be enabled when
> > > the MAXPHYADDRs differ. VLS doesn't work in that scenario.
> >
> > Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
> > exceeds the guest's MAXPHYADDR, but not the host's, right? So only
> > broken if the guest is misbehaving.
> >
> > Taking a step back, I am not disagreeing that VLS should not be used
> > with different MAXPHYADDRs, I am just saying it might be.
>
> KVM straight up doesn't support that (see my other reply).

Sean, I intend to send a new version today with 2 main diffs:
- Use cpuid_maxphyaddr() here instead of  kvm_host.maxphyaddr.
- Use a common helper for checking RAX for SVM instructions for both
the emulator and gp_interception() (see response on patch 4).

Holler if you want me to wait for further feedback.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 18:31             ` Yosry Ahmed
@ 2026-03-11 20:07               ` Yosry Ahmed
  2026-03-11 20:39                 ` Sean Christopherson
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-11 20:07 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Wed, Mar 11, 2026 at 11:31 AM Yosry Ahmed <yosry@kernel.org> wrote:
>
> On Fri, Mar 6, 2026 at 4:32 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> > > > > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > > > > consistently with vls=1, whether it's done by the hardware or the
> > > > > emulator.
> > > >
> > > > Consistency should not be an issue, since VLS cannot be enabled when
> > > > the MAXPHYADDRs differ. VLS doesn't work in that scenario.
> > >
> > > Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
> > > exceeds the guest's MAXPHYADDR, but not the host's, right? So only
> > > broken if the guest is misbehaving.
> > >
> > > Taking a step back, I am not disagreeing that VLS should not be used
> > > with different MAXPHYADDRs, I am just saying it might be.
> >
> > KVM straight up doesn't support that (see my other reply).
>
> Sean, I intend to send a new version today with 2 main diffs:
> - Use cpuid_maxphyaddr() here instead of  kvm_host.maxphyaddr.
> - Use a common helper for checking RAX for SVM instructions for both
> the emulator and gp_interception() (see response on patch 4).
>
> Holler if you want me to wait for further feedback.

I just realized I cannot just do cpuid_maxphyaddr(ctxt->vcpu) in
check_svme_pa(), because vcpu is defined as a void pointer in
x86_emulate_ctxt. Looking at commit c9b8b07cded5 ("KVM: x86:
Dynamically allocate per-vCPU emulation context"), I cannot tell why.

I was going to move emul_to_vcpu() to arch/x86/kvm/kvm_emulate.h, but
maybe we should just make this a normal struct kvm_vpcu pointer and
drop emul_to_vcpu() completely?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 20:07               ` Yosry Ahmed
@ 2026-03-11 20:39                 ` Sean Christopherson
  2026-03-11 20:50                   ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2026-03-11 20:39 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Wed, Mar 11, 2026, Yosry Ahmed wrote:
> On Wed, Mar 11, 2026 at 11:31 AM Yosry Ahmed <yosry@kernel.org> wrote:
> >
> > On Fri, Mar 6, 2026 at 4:32 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> > > > > > Right, but I am trying to have the #GP check for VMLOAD/VMSAVE behave
> > > > > > consistently with vls=1, whether it's done by the hardware or the
> > > > > > emulator.
> > > > >
> > > > > Consistency should not be an issue, since VLS cannot be enabled when
> > > > > the MAXPHYADDRs differ. VLS doesn't work in that scenario.
> > > >
> > > > Why? It's only broken if VMLOAD/VMSAVE is executed with a GPA that
> > > > exceeds the guest's MAXPHYADDR, but not the host's, right? So only
> > > > broken if the guest is misbehaving.
> > > >
> > > > Taking a step back, I am not disagreeing that VLS should not be used
> > > > with different MAXPHYADDRs, I am just saying it might be.
> > >
> > > KVM straight up doesn't support that (see my other reply).
> >
> > Sean, I intend to send a new version today with 2 main diffs:
> > - Use cpuid_maxphyaddr() here instead of  kvm_host.maxphyaddr.
> > - Use a common helper for checking RAX for SVM instructions for both
> > the emulator and gp_interception() (see response on patch 4).
> >
> > Holler if you want me to wait for further feedback.
> 
> I just realized I cannot just do cpuid_maxphyaddr(ctxt->vcpu) in
> check_svme_pa(), because vcpu is defined as a void pointer in
> x86_emulate_ctxt. Looking at commit c9b8b07cded5 ("KVM: x86:
> Dynamically allocate per-vCPU emulation context"), I cannot tell why.

To prevent dereferencing the vcpu object in emulator code.  It's kinda silly
because common KVM is tightly coupled to the emulator, but we try to contain
what the emulator can do.

> I was going to move emul_to_vcpu() to arch/x86/kvm/kvm_emulate.h, but
> maybe we should just make this a normal struct kvm_vpcu pointer and
> drop emul_to_vcpu() completely?

Heh, talk about scope creep, that'll open a much bigger can of worms and subsequent
discussion.

Honestly, why bother keeping check_svme_pa()?  Can't we just do the checks in
svm_check_intercept()?  E.g. vmx_check_intercept() already "injects" #UD for RDTSCP.

Ha!  And dropping check_svme_pa() would technically be a bug fix, because the #GP
is supposed to have lower priority than the #UD due to EFER.SVME=0.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 20:39                 ` Sean Christopherson
@ 2026-03-11 20:50                   ` Yosry Ahmed
  2026-03-11 23:01                     ` Sean Christopherson
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-11 20:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> > > Sean, I intend to send a new version today with 2 main diffs:
> > > - Use cpuid_maxphyaddr() here instead of  kvm_host.maxphyaddr.
> > > - Use a common helper for checking RAX for SVM instructions for both
> > > the emulator and gp_interception() (see response on patch 4).
> > >
> > > Holler if you want me to wait for further feedback.
> >
> > I just realized I cannot just do cpuid_maxphyaddr(ctxt->vcpu) in
> > check_svme_pa(), because vcpu is defined as a void pointer in
> > x86_emulate_ctxt. Looking at commit c9b8b07cded5 ("KVM: x86:
> > Dynamically allocate per-vCPU emulation context"), I cannot tell why.
>
> To prevent dereferencing the vcpu object in emulator code.  It's kinda silly
> because common KVM is tightly coupled to the emulator, but we try to contain
> what the emulator can do.
>
> > I was going to move emul_to_vcpu() to arch/x86/kvm/kvm_emulate.h, but
> > maybe we should just make this a normal struct kvm_vpcu pointer and
> > drop emul_to_vcpu() completely?
>
> Heh, talk about scope creep, that'll open a much bigger can of worms and subsequent
> discussion.

Ack.

> Honestly, why bother keeping check_svme_pa()?  Can't we just do the checks in
> svm_check_intercept()?  E.g. vmx_check_intercept() already "injects" #UD for RDTSCP.

Hmm svm_check_intercept() isn't semantically the right place AFAICT,
and more importantly, it's only called if the instruction is executed
in guest mode (i.e. in L2+).

> Ha!  And dropping check_svme_pa() would technically be a bug fix, because the #GP
> is supposed to have lower priority than the #UD due to EFER.SVME=0.

That can probably be fixed by calling check_svme() before checking RAX
in check_svme_pa().

I think we should keep check_svme_pa(), but we'll need to extract the
vCPU from the emulation context one way or another to check
MAXPHYADDR. We can cast ctxt->vcpu or pull emul_to_vcpu() into
arch/x86/kvm/kvm_emulate.h and use it.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 20:50                   ` Yosry Ahmed
@ 2026-03-11 23:01                     ` Sean Christopherson
  2026-03-11 23:22                       ` Yosry Ahmed
  2026-03-12 15:50                       ` Yosry Ahmed
  0 siblings, 2 replies; 30+ messages in thread
From: Sean Christopherson @ 2026-03-11 23:01 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Wed, Mar 11, 2026, Yosry Ahmed wrote:
> > > > Sean, I intend to send a new version today with 2 main diffs:
> > > > - Use cpuid_maxphyaddr() here instead of  kvm_host.maxphyaddr.
> > > > - Use a common helper for checking RAX for SVM instructions for both
> > > > the emulator and gp_interception() (see response on patch 4).
> > > >
> > > > Holler if you want me to wait for further feedback.
> > >
> > > I just realized I cannot just do cpuid_maxphyaddr(ctxt->vcpu) in
> > > check_svme_pa(), because vcpu is defined as a void pointer in
> > > x86_emulate_ctxt. Looking at commit c9b8b07cded5 ("KVM: x86:
> > > Dynamically allocate per-vCPU emulation context"), I cannot tell why.
> >
> > To prevent dereferencing the vcpu object in emulator code.  It's kinda silly
> > because common KVM is tightly coupled to the emulator, but we try to contain
> > what the emulator can do.
> >
> > > I was going to move emul_to_vcpu() to arch/x86/kvm/kvm_emulate.h, but
> > > maybe we should just make this a normal struct kvm_vpcu pointer and
> > > drop emul_to_vcpu() completely?
> >
> > Heh, talk about scope creep, that'll open a much bigger can of worms and subsequent
> > discussion.
> 
> Ack.
> 
> > Honestly, why bother keeping check_svme_pa()?  Can't we just do the checks in
> > svm_check_intercept()?  E.g. vmx_check_intercept() already "injects" #UD for RDTSCP.
> 
> Hmm svm_check_intercept() isn't semantically the right place AFAICT,

Actually, I think it is.  Per the APM:

  Generally, instruction intercepts are checked after simple exceptions (such as
  #GP—when CPL is incorrect—or #UD) have been checked, but before exceptions
  related to memory accesses (such as page faults) and exceptions based on
  specific operand values.

This #GP is operand specific.

> and more importantly, it's only called if the instruction is executed
> in guest mode (i.e. in L2+).

Hold up, we're getting ahead of ourselves.

The only legitimate reason the emulator is at all aware of VMSAVE, VMLOAD, and
VMRUN is to deal with #GP due to the RAX check, because hardware checks the GPA
against the host's physical address space.  See commit 82a11e9c6fa2 ("KVM: SVM:
Add emulation support for #GP triggered by SVM instructions").

The emulator "support" was originally added by commit 01de8b09e606 ("KVM: SVM:
Add intercept checks for SVM instructions"), but AFAICT, for all intents and
purposes that was dead code when it was added, because the emulator doesn't
actually _emulate_ the instructions.  I assume if they aren't intercepted, and
KVM is full on emulating instead of just decoding, they end up at EMULATION_FAILED
and get a #UD or something.

Outside of forced emulation or code stream rewriting, KVM should _never_ fully
emulate any of the SVM instructions except VMMCALL (and that is a super special
case).  KVM does need to _decode_ the instruction, and it needs to get the
pre-intercept exception checks correct so that KVM correctly injects e.g. #GP
instead of synthesizing a #VMEXIT for the CPL check, but KVM doesn't need to do
*all* of the checks.

Note, for L2, the SVME check is meaningless, as EFER.SVME has to be set for L2
to be active, i.e. it's L1's responsibility to handle that check.

Back to the physical address thing, KVM _already_ handles that check in the #GP
path, it's just wrong too:

		/* All SVM instructions expect page aligned RAX */
		if (svm->vmcb->save.rax & ~PAGE_MASK)
			goto reinject;

So I think what we want is to

  (a) fix the RAX check in gp_interception()
  (b) drop the RAX check in the emulator
  (c) add a CPL check in the emulator (because the intercepted #GP could have
      been due to L2 executing at CPL>0, not due to a bad-but-good RAX).

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index c8e292e9a24d..74df977a38ca 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3867,18 +3867,10 @@ static int check_svme(struct x86_emulate_ctxt *ctxt)
        if (!(efer & EFER_SVME))
                return emulate_ud(ctxt);
 
-       return X86EMUL_CONTINUE;
-}
-
-static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
-{
-       u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
-
-       /* Valid physical address? */
-       if (rax & 0xffff000000000000ULL)
+       if (ctxt->ops->cpl(ctxt))
                return emulate_gp(ctxt, 0);
 
-       return check_svme(ctxt);
+       return X86EMUL_CONTINUE;
 }
 
 static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
@@ -3984,10 +3976,10 @@ static const struct opcode group7_rm2[] = {
 };
 
 static const struct opcode group7_rm3[] = {
-       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme_pa),
+       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme),
        II(SrcNone  | Prot | EmulateOnUD,       em_hypercall,   vmmcall),
-       DIP(SrcNone | Prot | Priv,              vmload,         check_svme_pa),
-       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme_pa),
+       DIP(SrcNone | Prot | Priv,              vmload,         check_svme),
+       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme),
        DIP(SrcNone | Prot | Priv,              stgi,           check_svme),
        DIP(SrcNone | Prot | Priv,              clgi,           check_svme),
        DIP(SrcNone | Prot | Priv,              skinit,         check_svme),
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e6691c044913..e1223c07593b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2294,7 +2294,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
                                EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
        } else {
                /* All SVM instructions expect page aligned RAX */
-               if (svm->vmcb->save.rax & ~PAGE_MASK)
+               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
                        goto reinject;
 
                return emulate_svm_instr(vcpu, opcode);


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 23:01                     ` Sean Christopherson
@ 2026-03-11 23:22                       ` Yosry Ahmed
  2026-03-12  1:27                         ` Yosry Ahmed
  2026-03-12 15:50                       ` Yosry Ahmed
  1 sibling, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-11 23:22 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> > > Honestly, why bother keeping check_svme_pa()?  Can't we just do the checks in
> > > svm_check_intercept()?  E.g. vmx_check_intercept() already "injects" #UD for RDTSCP.
> >
> > Hmm svm_check_intercept() isn't semantically the right place AFAICT,
>
> Actually, I think it is.  Per the APM:
>
>   Generally, instruction intercepts are checked after simple exceptions (such as
>   #GP—when CPL is incorrect—or #UD) have been checked, but before exceptions
>   related to memory accesses (such as page faults) and exceptions based on
>   specific operand values.
>
> This #GP is operand specific.

Yes, but from KVM's perspective, svm_check_intercept() is meant to
check if the instruction executed by L2 needs to be intercepted by L1.
What this patch is doing is not specific to instructions executed by
L2.

Anyway, that's irrelevant now (see below).

>
> > and more importantly, it's only called if the instruction is executed
> > in guest mode (i.e. in L2+).
>
> Hold up, we're getting ahead of ourselves.
>
> The only legitimate reason the emulator is at all aware of VMSAVE, VMLOAD, and
> VMRUN is to deal with #GP due to the RAX check, because hardware checks the GPA
> against the host's physical address space.  See commit 82a11e9c6fa2 ("KVM: SVM:
> Add emulation support for #GP triggered by SVM instructions").
>
> The emulator "support" was originally added by commit 01de8b09e606 ("KVM: SVM:
> Add intercept checks for SVM instructions"), but AFAICT, for all intents and
> purposes that was dead code when it was added, because the emulator doesn't
> actually _emulate_ the instructions.  I assume if they aren't intercepted, and
> KVM is full on emulating instead of just decoding, they end up at EMULATION_FAILED
> and get a #UD or something.
>
> Outside of forced emulation or code stream rewriting, KVM should _never_ fully
> emulate any of the SVM instructions except VMMCALL (and that is a super special
> case).  KVM does need to _decode_ the instruction, and it needs to get the
> pre-intercept exception checks correct so that KVM correctly injects e.g. #GP
> instead of synthesizing a #VMEXIT for the CPL check, but KVM doesn't need to do
> *all* of the checks.
>
> Note, for L2, the SVME check is meaningless, as EFER.SVME has to be set for L2
> to be active, i.e. it's L1's responsibility to handle that check.
>
> Back to the physical address thing, KVM _already_ handles that check in the #GP
> path,

I guess if KVM is not intercepting #GP, then the hardware injects the
#GP and the emulator still doesn't have to worry about it -- because
we don't support the case where RAX can be legal from the host's
perspective but not the guest's. Makes sense.

> it's just wrong too:
>
>                 /* All SVM instructions expect page aligned RAX */
>                 if (svm->vmcb->save.rax & ~PAGE_MASK)
>                         goto reinject;
>
> So I think what we want is to
>
>   (a) fix the RAX check in gp_interception()
>   (b) drop the RAX check in the emulator
>   (c) add a CPL check in the emulator (because the intercepted #GP could have
>       been due to L2 executing at CPL>0, not due to a bad-but-good RAX).

Agree with this (also thanks for pointing out page_address_valid()).
Will add 3 patches doing this at the beginning of the series instead
of this one.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 23:22                       ` Yosry Ahmed
@ 2026-03-12  1:27                         ` Yosry Ahmed
  2026-03-12  1:38                           ` Sean Christopherson
  0 siblings, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-12  1:27 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> > Hold up, we're getting ahead of ourselves.
> >
> > The only legitimate reason the emulator is at all aware of VMSAVE, VMLOAD, and
> > VMRUN is to deal with #GP due to the RAX check, because hardware checks the GPA
> > against the host's physical address space.  See commit 82a11e9c6fa2 ("KVM: SVM:
> > Add emulation support for #GP triggered by SVM instructions").
> >
> > The emulator "support" was originally added by commit 01de8b09e606 ("KVM: SVM:
> > Add intercept checks for SVM instructions"), but AFAICT, for all intents and
> > purposes that was dead code when it was added, because the emulator doesn't
> > actually _emulate_ the instructions.  I assume if they aren't intercepted, and
> > KVM is full on emulating instead of just decoding, they end up at EMULATION_FAILED
> > and get a #UD or something.
> >
> > Outside of forced emulation or code stream rewriting, KVM should _never_ fully
> > emulate any of the SVM instructions except VMMCALL (and that is a super special
> > case).  KVM does need to _decode_ the instruction, and it needs to get the
> > pre-intercept exception checks correct so that KVM correctly injects e.g. #GP
> > instead of synthesizing a #VMEXIT for the CPL check, but KVM doesn't need to do
> > *all* of the checks.
> >
> > Note, for L2, the SVME check is meaningless, as EFER.SVME has to be set for L2
> > to be active, i.e. it's L1's responsibility to handle that check.
> >
> > Back to the physical address thing, KVM _already_ handles that check in the #GP
> > path,
>
> I guess if KVM is not intercepting #GP, then the hardware injects the
> #GP and the emulator still doesn't have to worry about it -- because
> we don't support the case where RAX can be legal from the host's
> perspective but not the guest's. Makes sense.
>
> > it's just wrong too:
> >
> >                 /* All SVM instructions expect page aligned RAX */
> >                 if (svm->vmcb->save.rax & ~PAGE_MASK)
> >                         goto reinject;
> >
> > So I think what we want is to
> >
> >   (a) fix the RAX check in gp_interception()
> >   (b) drop the RAX check in the emulator
> >   (c) add a CPL check in the emulator (because the intercepted #GP could have
> >       been due to L2 executing at CPL>0, not due to a bad-but-good RAX).

Actually, I don't think (c) is needed. In the path where KVM
intercepts #GP, it doesn't go through the emulation path which ends up
calling check_svme(), it only uses the emulator to decode the
instruction.

AFAICT, we can end up in the emulator only when the CPU does not
produce a #GP, e.g. when we get a #NPF on the address in RAX. In this
case, the CPU will have already checked the CPL for us, and the
validity of the address. The emulator checking EFER.SVME check is
probably also useless, because with Kevin's patches we should always
be intercepting VMLOAD/VMSAVE when EFER.SVME is disabled by the guest
and checking EFER.SVME there anyway.

Anyway, I want to touch the emulator as little as possible tbh, so I
will still do (b) because it unblocks this series (removes the wrong
GPA check that injects #GP), but will defer any further cleanups.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-12  1:27                         ` Yosry Ahmed
@ 2026-03-12  1:38                           ` Sean Christopherson
  0 siblings, 0 replies; 30+ messages in thread
From: Sean Christopherson @ 2026-03-12  1:38 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Wed, Mar 11, 2026, Yosry Ahmed wrote:
> > > Hold up, we're getting ahead of ourselves.
> > >
> > > The only legitimate reason the emulator is at all aware of VMSAVE, VMLOAD, and
> > > VMRUN is to deal with #GP due to the RAX check, because hardware checks the GPA
> > > against the host's physical address space.  See commit 82a11e9c6fa2 ("KVM: SVM:
> > > Add emulation support for #GP triggered by SVM instructions").
> > >
> > > The emulator "support" was originally added by commit 01de8b09e606 ("KVM: SVM:
> > > Add intercept checks for SVM instructions"), but AFAICT, for all intents and
> > > purposes that was dead code when it was added, because the emulator doesn't
> > > actually _emulate_ the instructions.  I assume if they aren't intercepted, and
> > > KVM is full on emulating instead of just decoding, they end up at EMULATION_FAILED
> > > and get a #UD or something.
> > >
> > > Outside of forced emulation or code stream rewriting, KVM should _never_ fully
> > > emulate any of the SVM instructions except VMMCALL (and that is a super special
> > > case).  KVM does need to _decode_ the instruction, and it needs to get the
> > > pre-intercept exception checks correct so that KVM correctly injects e.g. #GP
> > > instead of synthesizing a #VMEXIT for the CPL check, but KVM doesn't need to do
> > > *all* of the checks.
> > >
> > > Note, for L2, the SVME check is meaningless, as EFER.SVME has to be set for L2
> > > to be active, i.e. it's L1's responsibility to handle that check.
> > >
> > > Back to the physical address thing, KVM _already_ handles that check in the #GP
> > > path,
> >
> > I guess if KVM is not intercepting #GP, then the hardware injects the
> > #GP and the emulator still doesn't have to worry about it -- because
> > we don't support the case where RAX can be legal from the host's
> > perspective but not the guest's. Makes sense.
> >
> > > it's just wrong too:
> > >
> > >                 /* All SVM instructions expect page aligned RAX */
> > >                 if (svm->vmcb->save.rax & ~PAGE_MASK)
> > >                         goto reinject;
> > >
> > > So I think what we want is to
> > >
> > >   (a) fix the RAX check in gp_interception()
> > >   (b) drop the RAX check in the emulator
> > >   (c) add a CPL check in the emulator (because the intercepted #GP could have
> > >       been due to L2 executing at CPL>0, not due to a bad-but-good RAX).
> 
> Actually, I don't think (c) is needed. In the path where KVM
> intercepts #GP, it doesn't go through the emulation path which ends up
> calling check_svme(), it only uses the emulator to decode the
> instruction.

Oh, dagnabbit, that's stupidly obvious in hindsight.

> AFAICT, we can end up in the emulator only when the CPU does not
> produce a #GP, e.g. when we get a #NPF on the address in RAX. In this
> case, the CPU will have already checked the CPL for us, and the
> validity of the address. The emulator checking EFER.SVME check is
> probably also useless, because with Kevin's patches we should always
> be intercepting VMLOAD/VMSAVE when EFER.SVME is disabled by the guest
> and checking EFER.SVME there anyway.
> 
> Anyway, I want to touch the emulator as little as possible tbh, so I
> will still do (b) because it unblocks this series (removes the wrong
> GPA check that injects #GP), but will defer any further cleanups.

Yeah, works for me.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-11 23:01                     ` Sean Christopherson
  2026-03-11 23:22                       ` Yosry Ahmed
@ 2026-03-12 15:50                       ` Yosry Ahmed
  2026-03-12 15:54                         ` Sean Christopherson
  1 sibling, 1 reply; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-12 15:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index c8e292e9a24d..74df977a38ca 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3867,18 +3867,10 @@ static int check_svme(struct x86_emulate_ctxt *ctxt)
>         if (!(efer & EFER_SVME))
>                 return emulate_ud(ctxt);
>
> -       return X86EMUL_CONTINUE;
> -}
> -
> -static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
> -{
> -       u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
> -
> -       /* Valid physical address? */
> -       if (rax & 0xffff000000000000ULL)
> +       if (ctxt->ops->cpl(ctxt))
>                 return emulate_gp(ctxt, 0);
>
> -       return check_svme(ctxt);
> +       return X86EMUL_CONTINUE;
>  }
>
>  static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
> @@ -3984,10 +3976,10 @@ static const struct opcode group7_rm2[] = {
>  };
>
>  static const struct opcode group7_rm3[] = {
> -       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme_pa),
> +       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme),
>         II(SrcNone  | Prot | EmulateOnUD,       em_hypercall,   vmmcall),
> -       DIP(SrcNone | Prot | Priv,              vmload,         check_svme_pa),
> -       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme_pa),
> +       DIP(SrcNone | Prot | Priv,              vmload,         check_svme),
> +       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme),
>         DIP(SrcNone | Prot | Priv,              stgi,           check_svme),
>         DIP(SrcNone | Prot | Priv,              clgi,           check_svme),
>         DIP(SrcNone | Prot | Priv,              skinit,         check_svme),
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e6691c044913..e1223c07593b 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2294,7 +2294,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
>                                 EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
>         } else {
>                 /* All SVM instructions expect page aligned RAX */
> -               if (svm->vmcb->save.rax & ~PAGE_MASK)
> +               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
>                         goto reinject;

Final observation (hopefully), this check needs to be moved to the
VMRUN/VMLOAD/VMSAVE interception functions. As kvm_vcpu_map() failures
will stop injecting #GP, we still need to handle the case where
allow_smaller_maxphyaddr is used and the GPA is illegal from the
vCPU's perspective but not the host. In this case, the CPU does not
inject a #GP and the instructions are intercepted, so we need to make
sure the interception functions do the legality check on RAX.

I am doing this in a separate patch after fixing the check, and
opportunistically cleaning up gp_interception() by doing an early
return for the non-SVM insn case to reduce indentation. Will send v3
out after I am done testing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-12 15:50                       ` Yosry Ahmed
@ 2026-03-12 15:54                         ` Sean Christopherson
  2026-03-12 16:19                           ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2026-03-12 15:54 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

On Thu, Mar 12, 2026, Yosry Ahmed wrote:
> > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > index c8e292e9a24d..74df977a38ca 100644
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -3867,18 +3867,10 @@ static int check_svme(struct x86_emulate_ctxt *ctxt)
> >         if (!(efer & EFER_SVME))
> >                 return emulate_ud(ctxt);
> >
> > -       return X86EMUL_CONTINUE;
> > -}
> > -
> > -static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
> > -{
> > -       u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
> > -
> > -       /* Valid physical address? */
> > -       if (rax & 0xffff000000000000ULL)
> > +       if (ctxt->ops->cpl(ctxt))
> >                 return emulate_gp(ctxt, 0);
> >
> > -       return check_svme(ctxt);
> > +       return X86EMUL_CONTINUE;
> >  }
> >
> >  static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
> > @@ -3984,10 +3976,10 @@ static const struct opcode group7_rm2[] = {
> >  };
> >
> >  static const struct opcode group7_rm3[] = {
> > -       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme_pa),
> > +       DIP(SrcNone | Prot | Priv,              vmrun,          check_svme),
> >         II(SrcNone  | Prot | EmulateOnUD,       em_hypercall,   vmmcall),
> > -       DIP(SrcNone | Prot | Priv,              vmload,         check_svme_pa),
> > -       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme_pa),
> > +       DIP(SrcNone | Prot | Priv,              vmload,         check_svme),
> > +       DIP(SrcNone | Prot | Priv,              vmsave,         check_svme),
> >         DIP(SrcNone | Prot | Priv,              stgi,           check_svme),
> >         DIP(SrcNone | Prot | Priv,              clgi,           check_svme),
> >         DIP(SrcNone | Prot | Priv,              skinit,         check_svme),
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index e6691c044913..e1223c07593b 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2294,7 +2294,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >                                 EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> >         } else {
> >                 /* All SVM instructions expect page aligned RAX */
> > -               if (svm->vmcb->save.rax & ~PAGE_MASK)
> > +               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
> >                         goto reinject;
> 
> Final observation (hopefully), this check needs to be moved to the
> VMRUN/VMLOAD/VMSAVE interception functions.

Gah, yeah.  I noticed that when initially typing up my response, but lost track
of it when I got distracted by all the emulator crud.

> As kvm_vcpu_map() failures will stop injecting #GP, we still need to handle
> the case where allow_smaller_maxphyaddr is used and the GPA is illegal from
> the vCPU's perspective but not the host.

allow_smaller_maxphyaddr is irrelevant.  My read of the APM is that the intercept
has priority over the #GP due to a bad RAX.  So with vls=0, KVM needs to check
RAX irrespective of allow_smaller_maxphyaddr.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE
  2026-03-12 15:54                         ` Sean Christopherson
@ 2026-03-12 16:19                           ` Yosry Ahmed
  0 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-12 16:19 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, Paolo Bonzini, kvm, linux-kernel

> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index e6691c044913..e1223c07593b 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -2294,7 +2294,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> > >                                 EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> > >         } else {
> > >                 /* All SVM instructions expect page aligned RAX */
> > > -               if (svm->vmcb->save.rax & ~PAGE_MASK)
> > > +               if (!page_address_valid(vcpu, svm->vmcb->save.rax))
> > >                         goto reinject;
> >
> > Final observation (hopefully), this check needs to be moved to the
> > VMRUN/VMLOAD/VMSAVE interception functions.
>
> Gah, yeah.  I noticed that when initially typing up my response, but lost track
> of it when I got distracted by all the emulator crud.
>
> > As kvm_vcpu_map() failures will stop injecting #GP, we still need to handle
> > the case where allow_smaller_maxphyaddr is used and the GPA is illegal from
> > the vCPU's perspective but not the host.
>
> allow_smaller_maxphyaddr is irrelevant.  My read of the APM is that the intercept
> has priority over the #GP due to a bad RAX.  So with vls=0, KVM needs to check
> RAX irrespective of allow_smaller_maxphyaddr.

Oh yeah, good point.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache()
  2026-03-06 21:08 ` [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache() Yosry Ahmed
@ 2026-03-12 18:13   ` Sean Christopherson
  2026-03-12 21:01     ` Yosry Ahmed
  0 siblings, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2026-03-12 18:13 UTC (permalink / raw)
  To: Yosry Ahmed; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> nested_svm_vmrun() currently stores the return value of
> nested_svm_copy_vmcb12_to_cache() in a local variable 'err', separate
> from the generally used 'ret' variable. This is done to have a single
> call to kvm_skip_emulated_instruction(), such that we can store the
> return value of kvm_skip_emulated_instruction() in 'ret', and then
> re-check the return value of nested_svm_copy_vmcb12_to_cache() in 'err'.
> 
> The code is unnecessarily confusing. Instead, call
> kvm_skip_emulated_instruction() in the failure path of
> nested_svm_copy_vmcb12_to_cache() if the return value is not -EFAULT,
> and drop 'err'.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yosry Ahmed <yosry@kernel.org>

FYI, I'm going to grab this right now to make it slightly easier to resolve the
merge conflict with Paolo's SMM fixes (the ret vs. err stuff is so confusing).

> ---
>  arch/x86/kvm/svm/nested.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index b191c6cab57db..6d4c053778b21 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1079,7 +1079,7 @@ static int nested_svm_copy_vmcb12_to_cache(struct kvm_vcpu *vcpu, u64 vmcb12_gpa
>  int nested_svm_vmrun(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
> -	int ret, err;
> +	int ret;
>  	u64 vmcb12_gpa;
>  	struct vmcb *vmcb01 = svm->vmcb01.ptr;
>  
> @@ -1104,19 +1104,20 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
>  		return -EINVAL;
>  
>  	vmcb12_gpa = svm->vmcb->save.rax;
> -	err = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
> -	if (err == -EFAULT) {
> +	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
> +
> +	/*
> +	 * Advance RIP if #GP or #UD are not injected, but otherwise
> +	 * stop if copying and checking vmcb12 failed.
> +	 */
> +	if (ret == -EFAULT) {
>  		kvm_inject_gp(vcpu, 0);
>  		return 1;
> +	} else if (ret) {
> +		return kvm_skip_emulated_instruction(vcpu);
>  	}

I strongly dislike the if-elif approach, because it makes unnecessarily hard to
see that *all* ret !=0 cases are handled, i.e. that overwriting ret below is ok.

The comment is also super confusing, because there's no #UD in sight, but there
is a #GP. 

This is what I have locally and am planning on pushing to kvm-x86/next.

	ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
	if (ret) {
		if (ret == -EFAULT) {
			kvm_inject_gp(vcpu, 0);
			return 1;
		}

		/* Advance RIP past VMRUN as part of the nested #VMEXIT. */
		return kvm_skip_emulated_instruction(vcpu);
	}

	/* At this point, VMRUN is guaranteed to not fault; advance RIP. */
	ret = kvm_skip_emulated_instruction(vcpu);

>  
> -	/*
> -	 * Advance RIP if #GP or #UD are not injected, but otherwise stop if
> -	 * copying and checking vmcb12 failed.
> -	 */
>  	ret = kvm_skip_emulated_instruction(vcpu);
> -	if (err)
> -		return ret;
>  
>  	/*
>  	 * Since vmcb01 is not in use, we can use it to store some of the L1
> -- 
> 2.53.0.473.g4a7958ca14-goog
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache()
  2026-03-12 18:13   ` Sean Christopherson
@ 2026-03-12 21:01     ` Yosry Ahmed
  0 siblings, 0 replies; 30+ messages in thread
From: Yosry Ahmed @ 2026-03-12 21:01 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, Jim Mattson, kvm, linux-kernel

> I strongly dislike the if-elif approach, because it makes unnecessarily hard to
> see that *all* ret !=0 cases are handled, i.e. that overwriting ret below is ok.
>
> The comment is also super confusing, because there's no #UD in sight, but there
> is a #GP.
>
> This is what I have locally and am planning on pushing to kvm-x86/next.
>
>         ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
>         if (ret) {
>                 if (ret == -EFAULT) {
>                         kvm_inject_gp(vcpu, 0);
>                         return 1;
>                 }
>
>                 /* Advance RIP past VMRUN as part of the nested #VMEXIT. */
>                 return kvm_skip_emulated_instruction(vcpu);
>         }
>
>         /* At this point, VMRUN is guaranteed to not fault; advance RIP. */
>         ret = kvm_skip_emulated_instruction(vcpu);

Looks good. I will rebase on top of this and drop the patch before
sending the next version.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2026-03-12 21:02 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-06 21:08 [PATCH v2 0/6] KVM: nSVM: Fix vmcb12 mapping failure handling Yosry Ahmed
2026-03-06 21:08 ` [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE Yosry Ahmed
2026-03-06 22:27   ` Jim Mattson
2026-03-06 22:37     ` Yosry Ahmed
2026-03-06 23:12       ` Jim Mattson
2026-03-06 23:20         ` Yosry Ahmed
2026-03-06 23:45           ` Jim Mattson
2026-03-07  0:32           ` Sean Christopherson
2026-03-11 18:31             ` Yosry Ahmed
2026-03-11 20:07               ` Yosry Ahmed
2026-03-11 20:39                 ` Sean Christopherson
2026-03-11 20:50                   ` Yosry Ahmed
2026-03-11 23:01                     ` Sean Christopherson
2026-03-11 23:22                       ` Yosry Ahmed
2026-03-12  1:27                         ` Yosry Ahmed
2026-03-12  1:38                           ` Sean Christopherson
2026-03-12 15:50                       ` Yosry Ahmed
2026-03-12 15:54                         ` Sean Christopherson
2026-03-12 16:19                           ` Yosry Ahmed
2026-03-07  0:28         ` Sean Christopherson
2026-03-07  0:31           ` Yosry Ahmed
2026-03-06 21:08 ` [PATCH v2 2/6] KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache() Yosry Ahmed
2026-03-12 18:13   ` Sean Christopherson
2026-03-12 21:01     ` Yosry Ahmed
2026-03-06 21:08 ` [PATCH v2 3/6] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
2026-03-06 21:08 ` [PATCH v2 4/6] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
2026-03-07  1:09   ` Yosry Ahmed
2026-03-09 13:56     ` Yosry Ahmed
2026-03-06 21:08 ` [PATCH v2 5/6] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
2026-03-06 21:09 ` [PATCH v2 6/6] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox