The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH 0/3] KVM: harden and cleanup PDPTR load on forced L1 reload
@ 2026-06-04 16:07 Paolo Bonzini
  2026-06-04 16:07 ` [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail Paolo Bonzini
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Paolo Bonzini @ 2026-06-04 16:07 UTC (permalink / raw)
  To: linux-kernel, kvm

Nested VMX tries to detect all possible VMFail cases before committing
to execute VMLAUNCH/VMRESUME, because the steps leading to the VMCS02
VM entry need to load various bits of the L2 processor state into
KVM's software model (vcpu->arch, essentially).

However, there are cases that are not caught or in some cases even
racy because the data comes from memory rather than from the VMCS;
in that case the earlier load of L2 state needs to be unwound, and
nested_vmx_restore_host_state() exists for that purpose.  Sashiko
found a hole where L1's CR3 is restored there upon a VM-Entry failure,
but the PDPTRs are only restored if EPT is enabled. If shadow paging
is used, the L2 PDPTRs from the aborted entry attempt will remain in
vcpu->arch.root_mmu->pdptrs.

Fix this by forcing use of nested_vmx_load_cr3(), in the same guise
as load_vmcs12_host_state().

The other two patches are respectively another minor
hardening/clarification, and a small optimization.

Paolo Bonzini (3):
  KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail
  KVM: MMU: unconditionally clear MMIO cache on root rebuild
  KVM: nVMX: remove unnecessary unload on processor-detected VMFail

 arch/x86/kvm/mmu/mmu.c    |  1 +
 arch/x86/kvm/vmx/nested.c | 15 +++++++--------
 2 files changed, 8 insertions(+), 8 deletions(-)

-- 
2.52.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail
  2026-06-04 16:07 [PATCH 0/3] KVM: harden and cleanup PDPTR load on forced L1 reload Paolo Bonzini
@ 2026-06-04 16:07 ` Paolo Bonzini
  2026-06-09  3:31   ` Sean Christopherson
  2026-06-04 16:07 ` [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild Paolo Bonzini
  2026-06-04 16:07 ` [PATCH 3/3] KVM: nVMX: remove unnecessary unload on processor-detected VMFail Paolo Bonzini
  2 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2026-06-04 16:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: stable

Upon a VM-entry failure that is caught by the processor rather than
KVM, nested_vmx_restore_host_state() restores L1's CR3 but not the
PDPTRs.  If shadow paging is used (enable_ept is false), the L2
PDPTRs loaded during the aborted entry attempt remain in
vcpu->arch.mmu->pdptrs[].

Note that the fact that the PDPTRs are stored in the MMU does not
save the day, because KVM only uses root_mmu if enable_ept is false.

To fix this, use nested_vmx_load_cr3() instead of open coding
just the load of vcpu->arch.cr3, in the same guise as
load_vmcs12_host_state().  nested_vmx_load_cr3() will mark the
register as dirty rather than available, but this is only a
very minor pessimization.

If EPT *is* in use, do not load the PDPTRs and rely solely on
ept_save_pdptrs() to reload them from VMCS01.  When vmx_load_mmu_pgd()
runs on the next entry, the PDPTRs are available---meaning they are
not incorrectly reloaded from memory.

kvm_mmu_unload() is preserved to keep the paths from the old
kvm_mmu_reset_context(), but is actually unnecessary.  It can
be removed as a separate patch.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4690a4d23709..d612a5d071fc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4947,6 +4947,7 @@ static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)
 
 static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 {
+	enum vm_entry_failure_code ignored;
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmx_msr_entry g, h;
@@ -4984,20 +4985,19 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 	vmx_set_cr4(vcpu, vmcs_readl(CR4_READ_SHADOW));
 
 	nested_ept_uninit_mmu_context(vcpu);
-	vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
-	kvm_register_mark_available(vcpu, VCPU_REG_CR3);
 
 	/*
-	 * Use ept_save_pdptrs(vcpu) to load the MMU's cached PDPTRs
-	 * from vmcs01 (if necessary).  The PDPTRs are not loaded on
-	 * VMFail, like everything else we just need to ensure our
-	 * software model is up-to-date.
+	 * Now that nested EPT has been disabled, load the MMU's CR3 and
+	 * possibly PDPTRs from vmcs01 (if necessary).  This should not
+	 * happen for VMFail, but we get here if the check was caught by
+	 * the processor and therefore the guest CR3 was loaded prematurely.
 	 */
+	kvm_mmu_unload(vcpu);
+	if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
+		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
 	if (enable_ept && is_pae_paging(vcpu))
 		ept_save_pdptrs(vcpu);
 
-	kvm_mmu_reset_context(vcpu);
-
 	/*
 	 * This nasty bit of open coding is a compromise between blindly
 	 * loading L1's MSRs using the exit load lists (incorrect emulation
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild
  2026-06-04 16:07 [PATCH 0/3] KVM: harden and cleanup PDPTR load on forced L1 reload Paolo Bonzini
  2026-06-04 16:07 ` [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail Paolo Bonzini
@ 2026-06-04 16:07 ` Paolo Bonzini
  2026-06-09 20:24   ` Sean Christopherson
  2026-06-04 16:07 ` [PATCH 3/3] KVM: nVMX: remove unnecessary unload on processor-detected VMFail Paolo Bonzini
  2 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2026-06-04 16:07 UTC (permalink / raw)
  To: linux-kernel, kvm

Upon changing CR3, the MMIO cache becomes invalid because the
GVA->GPA mapping has changed.  However, kvm_load_new_pgd() calls
vcpu_clear_mmio_info() call only if the fast switch succeeded.
The early-return path instead leaves the root invalid; the next entry
then calls kvm_mmu_reload() and from there kvm_mmu_load().

kvm_mmu_load() calls kvm_mmu_sync_roots(), which clears the MMIO
cache, but one combination that falls through is root_role.direct==1,
i.e. CR0.PG=0, for which kvm_mmu_sync_roots() bails before reaching the
call to vcpu_clear_mmio_info().

That combination is barely reachable: a valid direct root is pretty much
always a fast-switch success because it does not check the PGD for a
match.  The early return for a direct root thus requires the current root
to already be invalid, and kvm_mmu_unload() itself clears the MMIO cache.

That said, doing an independent clear in the style of kvm_mmu_new_pgd()
is more obviously correct and basically free, so harden it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f8aa7eda661e..6689c9f8ae16 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6138,6 +6138,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	if (r)
 		goto out;
 
+	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 	kvm_mmu_sync_roots(vcpu);
 
 	kvm_mmu_load_pgd(vcpu);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] KVM: nVMX: remove unnecessary unload on processor-detected VMFail
  2026-06-04 16:07 [PATCH 0/3] KVM: harden and cleanup PDPTR load on forced L1 reload Paolo Bonzini
  2026-06-04 16:07 ` [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail Paolo Bonzini
  2026-06-04 16:07 ` [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild Paolo Bonzini
@ 2026-06-04 16:07 ` Paolo Bonzini
  2 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2026-06-04 16:07 UTC (permalink / raw)
  To: linux-kernel, kvm

nested_vmx_restore_host_state() is following a similar scheme to
load_vmcs12_host_state() which does not need a kvm_mmu_unload().
So, does nested_vmx_restore_host_state() need it?

The answer is no.  In the shadow case, kvm_init_mmu()
in nested_vmx_load_cr3() is enough to set a root_role
with guest_mode==0.  kvm_mmu_new_pgd() then is now
able to reuse an old root.  In the EPT case, root_mmu
still holds L1's valid root because L2 used guest_mmu.

Removing kvm_mmu_unload() thus is marginally more
efficient and it makes the two host state restore paths
identical.

The other thing that kvm_mmu_unload() does is clearing
the MMIO GVA cache.  This was ensured previously by
calling vcpu_clear_mmio_info() from kvm_mmu_load()
rather than just kvm_mmu_new_pgd().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d612a5d071fc..8b20a5eac1c9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4992,7 +4992,6 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 	 * happen for VMFail, but we get here if the check was caught by
 	 * the processor and therefore the guest CR3 was loaded prematurely.
 	 */
-	kvm_mmu_unload(vcpu);
 	if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
 		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
 	if (enable_ept && is_pae_paging(vcpu))
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail
  2026-06-04 16:07 ` [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail Paolo Bonzini
@ 2026-06-09  3:31   ` Sean Christopherson
  0 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-06-09  3:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, stable

On Thu, Jun 04, 2026, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 4690a4d23709..d612a5d071fc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4947,6 +4947,7 @@ static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)
>  
>  static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
>  {
> +	enum vm_entry_failure_code ignored;
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	struct vmx_msr_entry g, h;
> @@ -4984,20 +4985,19 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
>  	vmx_set_cr4(vcpu, vmcs_readl(CR4_READ_SHADOW));
>  
>  	nested_ept_uninit_mmu_context(vcpu);
> -	vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
> -	kvm_register_mark_available(vcpu, VCPU_REG_CR3);
>  
>  	/*
> -	 * Use ept_save_pdptrs(vcpu) to load the MMU's cached PDPTRs
> -	 * from vmcs01 (if necessary).  The PDPTRs are not loaded on
> -	 * VMFail, like everything else we just need to ensure our
> -	 * software model is up-to-date.
> +	 * Now that nested EPT has been disabled, load the MMU's CR3 and
> +	 * possibly PDPTRs from vmcs01 (if necessary).  This should not
> +	 * happen for VMFail, but we get here if the check was caught by
> +	 * the processor and therefore the guest CR3 was loaded prematurely.
>  	 */
> +	kvm_mmu_unload(vcpu);
> +	if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
> +		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);

This isn't quite correct either.  I mean, none of this is architecturally correct,
but this is less correct than the other incorrect code here :-)

To do this "right", KVM should snapshot the PDPTRs and shove them into the MMU,
without touching guest memory.

On a very related topic, I have a patch to stash CR3 in software instead of
abusing vmcs01.GUEST_CR3, as KVM fails to restore vmcs01.GUEST_CR3 to its proper
state if nested_vmx_enter_non_root_mode() bails after clobbering vmcs01.GUEST_CR3,
but before loading guest state.  We could probably do the same thing for PDPTRs?

https://lore.kernel.org/all/20260603223418.1720035-3-seanjc@google.com

>  	if (enable_ept && is_pae_paging(vcpu))
>  		ept_save_pdptrs(vcpu);
>  
> -	kvm_mmu_reset_context(vcpu);
> -
>  	/*
>  	 * This nasty bit of open coding is a compromise between blindly
>  	 * loading L1's MSRs using the exit load lists (incorrect emulation
> -- 
> 2.52.0
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild
  2026-06-04 16:07 ` [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild Paolo Bonzini
@ 2026-06-09 20:24   ` Sean Christopherson
  0 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-06-09 20:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm

On Thu, Jun 04, 2026, Paolo Bonzini wrote:
> Upon changing CR3, the MMIO cache becomes invalid because the
> GVA->GPA mapping has changed.  However, kvm_load_new_pgd() calls

s/kvm_load_new_pgd/kvm_mmu_new_pgd

> vcpu_clear_mmio_info() call only if the fast switch succeeded.
> The early-return path instead leaves the root invalid; the next entry
> then calls kvm_mmu_reload() and from there kvm_mmu_load().
> 
> kvm_mmu_load() calls kvm_mmu_sync_roots(), which clears the MMIO
> cache, but one combination that falls through is root_role.direct==1,
> i.e. CR0.PG=0, for which kvm_mmu_sync_roots() bails before reaching the
> call to vcpu_clear_mmio_info().
> 
> That combination is barely reachable: a valid direct root is pretty much
> always a fast-switch success because it does not check the PGD for a
> match.  The early return for a direct root thus requires the current root
> to already be invalid, and kvm_mmu_unload() itself clears the MMIO cache.
> 
> That said, doing an independent clear in the style of kvm_mmu_new_pgd()
> is more obviously correct and basically free, so harden it.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index f8aa7eda661e..6689c9f8ae16 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6138,6 +6138,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
>  	if (r)
>  		goto out;
>  
> +	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
>  	kvm_mmu_sync_roots(vcpu);


I don't dislike this change, but I don't really like it either.  MMIO caching as
a whole is a mess of spaghetti.  E.g. handle_mmio_page_fault() will never use or
cache a gva for the CR0.PG=0 case, and so the only way for this to be reachable
in any capacity is if KVM emulates an instruction, in response to something *other*
than a #PF (otherwise KVM will use the GPA), immediately after inserting into the
cache via kvm_handle_noslot_fault().

IMO, flushing cache here, when it's _just_ barely necessary is rather confusing.
E.g. kvm_mmu_sync_roots() obviously handles the shadow paging case, and TDP can't
ever use the GVA-based cache because KVM doesn't track CR3 changes (and it's
*really* hard to see that TDP is safe/fine).  We could "fix" that with a comment,
but rather than go through all that effort to support a path that in all likelihood
never gets a cache hit, or at least never gets enough hits to provide meaningful
value, we actually make direct MMUs completely mutually exclusive with GVA-based
caching?

It's actually a fairly nice cleanup, and would entirely obviate the need to worry
about flushing the cache for CR0.PG=0.

E.g. over a few patches (completely untested):

---
 arch/x86/kvm/mmu/mmu.c | 29 ++++++++++++-----------------
 arch/x86/kvm/x86.h     | 24 +++++++++++++++++-------
 2 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 91843e9224d0..42ac9986d73b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3533,14 +3533,12 @@ static int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu,
 				   struct kvm_page_fault *fault,
 				   unsigned int access)
 {
-	gva_t gva = fault->is_tdp ? 0 : fault->addr;
-
 	if (fault->is_private) {
 		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
 		return -EFAULT;
 	}
 
-	vcpu_cache_mmio_info(vcpu, gva, fault->gfn,
+	vcpu_cache_mmio_info(vcpu, fault->addr, fault->gfn,
 			     access & shadow_mmio_access_mask);
 
 	fault->slot = NULL;
@@ -4364,19 +4362,20 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	return kvm_translate_gpa(vcpu, mmu, vaddr, access, exception);
 }
 
-static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
+static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 cr2_or_gpa)
 {
 	/*
-	 * A nested guest cannot use the MMIO cache if it is using nested
-	 * page tables, because cr2 is a nGPA while the cache stores GPAs.
+	 * A nested guest cannot use the software MMIO cache if it is using
+	 * nested TDP, because the address at the time of fault is a nGPA,
+	 * while the cache stores GPAs.
 	 */
 	if (mmu_is_nested(vcpu))
 		return false;
 
-	if (direct)
-		return vcpu_match_mmio_gpa(vcpu, addr);
+	if (vcpu_can_cache_mmio_gva(vcpu))
+		return vcpu_match_mmio_gva(vcpu, cr2_or_gpa);
 
-	return vcpu_match_mmio_gva(vcpu, addr);
+	return vcpu_match_mmio_gpa(vcpu, cr2_or_gpa);
 }
 
 /*
@@ -4462,12 +4461,12 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	return reserved;
 }
 
-static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
+static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr)
 {
 	u64 spte;
 	bool reserved;
 
-	if (mmio_info_in_cache(vcpu, addr, direct))
+	if (mmio_info_in_cache(vcpu, addr))
 		return RET_PF_EMULATE;
 
 	reserved = get_mmio_spte(vcpu, addr, &spte);
@@ -4481,9 +4480,6 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 		if (!check_mmio_spte(vcpu, spte))
 			return RET_PF_INVALID;
 
-		if (direct)
-			addr = 0;
-
 		trace_handle_mmio_page_fault(addr, gfn, access);
 		vcpu_cache_mmio_info(vcpu, addr, gfn, access);
 		return RET_PF_EMULATE;
@@ -6359,7 +6355,7 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	 * and could put the vCPU into an infinite loop because the processor
 	 * will keep faulting on the non-existent MMIO address.
 	 */
-	if (WARN_ON_ONCE(mmio_info_in_cache(vcpu, cr2_or_gpa, direct)))
+	if (WARN_ON_ONCE(mmio_info_in_cache(vcpu, cr2_or_gpa)))
 		return RET_PF_EMULATE;
 
 	/*
@@ -6421,7 +6417,6 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
 		       void *insn, int insn_len)
 {
 	int r, emulation_type = EMULTYPE_PF;
-	bool direct = vcpu->arch.mmu->root_role.direct;
 
 	if (WARN_ON_ONCE(!VALID_PAGE(vcpu->arch.mmu->root.hpa)))
 		return RET_PF_RETRY;
@@ -6445,7 +6440,7 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
 		if (WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))
 			return -EFAULT;
 
-		r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
+		r = handle_mmio_page_fault(vcpu, cr2_or_gpa);
 		if (r == RET_PF_EMULATE)
 			goto emulate;
 	}
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 38a905fa86de..64da5ba8539b 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -366,6 +366,19 @@ static inline bool is_noncanonical_invlpg_address(u64 la, struct kvm_vcpu *vcpu)
 	return is_noncanonical_address(la, vcpu, X86EMUL_F_INVLPG);
 }
 
+static inline bool vcpu_can_cache_mmio_gva(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Disable GVA-based caching if TDP is enabled, as GVA is actually a
+	 * GPA or nGPA (if L2 is active), and KVM doesn't track CR3 changes,
+	 * i.e. can't know when to flush the cache.  Similarly, don't track
+	 * GVAs for direct MMUs (CR0.PG=0), as CR2 == GPA, i.e. KVM can simply
+	 * use the GFN cache, so that KVM doesn't have to flush the cache even
+	 * when all other shadow paging update/sync operations can be skipped.
+	 */
+	return !tdp_enabled && is_paging(vcpu);
+}
+
 static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
 					gva_t gva, gfn_t gfn, unsigned access)
 {
@@ -374,11 +387,7 @@ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
 	if (unlikely(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS))
 		return;
 
-	/*
-	 * If this is a shadow nested page table, the "GVA" is
-	 * actually a nGPA.
-	 */
-	vcpu->arch.mmio_gva = mmu_is_nested(vcpu) ? 0 : gva & PAGE_MASK;
+	vcpu->arch.mmio_gva = vcpu_can_cache_mmio_gva(vcpu) ? gva & PAGE_MASK : 0;
 	vcpu->arch.mmio_access = access;
 	vcpu->arch.mmio_gfn = gfn;
 	vcpu->arch.mmio_gen = gen;
@@ -405,8 +414,9 @@ static inline void vcpu_clear_mmio_info(struct kvm_vcpu *vcpu, gva_t gva)
 
 static inline bool vcpu_match_mmio_gva(struct kvm_vcpu *vcpu, unsigned long gva)
 {
-	if (vcpu_match_mmio_gen(vcpu) && vcpu->arch.mmio_gva &&
-	      vcpu->arch.mmio_gva == (gva & PAGE_MASK))
+	if (vcpu_can_cache_mmio_gva(vcpu) &&
+	    vcpu_match_mmio_gen(vcpu) &&
+	    vcpu->arch.mmio_gva && vcpu->arch.mmio_gva == (gva & PAGE_MASK))
 		return true;
 
 	return false;

base-commit: 03e00f8b41e58b5dd5ceef21940a62e5d8e08766
--

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-06-09 20:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-04 16:07 [PATCH 0/3] KVM: harden and cleanup PDPTR load on forced L1 reload Paolo Bonzini
2026-06-04 16:07 ` [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail Paolo Bonzini
2026-06-09  3:31   ` Sean Christopherson
2026-06-04 16:07 ` [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild Paolo Bonzini
2026-06-09 20:24   ` Sean Christopherson
2026-06-04 16:07 ` [PATCH 3/3] KVM: nVMX: remove unnecessary unload on processor-detected VMFail Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox