public inbox for stable@vger.kernel.org
* [PATCH] KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
@ 2026-05-03 20:10 Paolo Bonzini
  2026-05-05 10:13 ` stable backports for "KVM: x86: Fix shadow paging use-after-free due to unexpected GFN" Paolo Bonzini
From: Paolo Bonzini @ 2026-05-03 20:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Sean Christopherson, Alexander Bulekov, Fred Griffoul, stable

From: Sean Christopherson <seanjc@google.com>

The shadow MMU computes GFNs for direct shadow pages using sp->gfn plus
the SPTE index. This assumption breaks for shadow paging if the guest
page tables are modified between VM entries (similar to commit
aad885e77496, "KVM: x86/mmu: Drop/zap existing present SPTE even
when creating an MMIO SPTE", 2026-03-27).  The flow is as follows:

1. A PDE is installed for a 2MB mapping, and a page in that area is
   accessed.  KVM creates a kvm_mmu_page consisting of 512 4KB pages;
   the kvm_mmu_page is marked by FNAME(fetch) as direct-mapped because
   the guest's mapping is a huge page (and thus contiguous).

2. The PDE mapping is changed from outside the guest.

3. The guest accesses another page in the same 2MB area.  KVM installs
   a new leaf SPTE and rmap entry; the SPTE uses the "correct" GFN
   (i.e. based on the new mapping, as changed in step 2) but that GFN
   is outside of the [sp->gfn, sp->gfn + 511] range; therefore the
   rmap entry cannot be found and removed when the kvm_mmu_page is
   zapped.

4. The memslot that covers the first 2MB mapping is deleted, and the
   kvm_mmu_page for the now-invalid GPA is zapped.  However, rmap_remove()
   only looks at the [sp->gfn, sp->gfn + 511] range established in step 1,
   and fails to find the rmap entry that was recorded by step 3.

5. Any operation that causes an rmap walk for the page accessed in
   step 3 then walks the stale rmap and dereferences the freed
   kvm_mmu_page.  This includes dirty logging and MMU notifier
   invalidations (e.g., from MADV_DONTNEED).

The underlying issue is that KVM's walking of shadow PTEs assumes that
if a SPTE is present when KVM wants to install a non-leaf SPTE, the
existing kvm_mmu_page must be for the correct gfn, because the only way
for the gfn to be wrong is if KVM messed up and failed to zap a SPTE...
which shouldn't happen, but as the flow above shows actually does
happen in response to a guest write.

That bug dates back literally forever, as even the first version of KVM
assumes that the GFN matches and walks into the "wrong" shadow page.
However, that was only an imprecision until 2032a93d66fa ("KVM: MMU:
Don't allocate gfns page for direct mmu pages") came along.

Fix it by checking for a target gfn mismatch and zapping the existing
SPTE.  That way the old SP and rmap entries are gone, KVM installs
the rmap in the right location, and everyone is happy.

Fixes: 2032a93d66fa ("KVM: MMU: Don't allocate gfns page for direct mmu pages")
Fixes: 6aa8b732ca01 ("kvm: userspace interface")
Reported-by: Alexander Bulekov <bkov@amazon.com>
Reported-by: Fred Griffoul <fgriffo@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 35 ++++++++++++++---------------------
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 24fbc9ea502a..892246204435 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -182,6 +182,8 @@ static struct kmem_cache *pte_list_desc_cache;
 struct kmem_cache *mmu_page_header_cache;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
+static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
+			    u64 *spte, struct list_head *invalid_list);
 
 struct kvm_mmu_role_regs {
 	const unsigned long cr0;
@@ -1287,19 +1289,6 @@ static void drop_spte(struct kvm *kvm, u64 *sptep)
 		rmap_remove(kvm, sptep);
 }
 
-static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush)
-{
-	struct kvm_mmu_page *sp;
-
-	sp = sptep_to_sp(sptep);
-	WARN_ON_ONCE(sp->role.level == PG_LEVEL_4K);
-
-	drop_spte(kvm, sptep);
-
-	if (flush)
-		kvm_flush_remote_tlbs_sptep(kvm, sptep);
-}
-
 /*
  * Write-protect on the specified @sptep, @pt_protect indicates whether
  * spte write-protection is caused by protecting shadow page table.
@@ -2466,7 +2455,8 @@ static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu,
 {
 	union kvm_mmu_page_role role;
 
-	if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep))
+	if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep) &&
+	    spte_to_child_sp(*sptep) && spte_to_child_sp(*sptep)->gfn == gfn)
 		return ERR_PTR(-EEXIST);
 
 	role = kvm_mmu_child_role(sptep, direct, access);
@@ -2544,13 +2534,16 @@ static void __link_shadow_page(struct kvm *kvm,
 
 	BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK);
 
-	/*
-	 * If an SPTE is present already, it must be a leaf and therefore
-	 * a large one.  Drop it, and flush the TLB if needed, before
-	 * installing sp.
-	 */
-	if (is_shadow_present_pte(*sptep))
-		drop_large_spte(kvm, sptep, flush);
+	if (is_shadow_present_pte(*sptep)) {
+		struct kvm_mmu_page *parent_sp;
+		LIST_HEAD(invalid_list);
+
+		parent_sp = sptep_to_sp(sptep);
+		WARN_ON_ONCE(parent_sp->role.level == PG_LEVEL_4K);
+
+		mmu_page_zap_pte(kvm, parent_sp, sptep, &invalid_list);
+		kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, true);
+	}
 
 	spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp));
 
-- 
2.54.0



* stable backports for "KVM: x86: Fix shadow paging use-after-free due to unexpected GFN"
  2026-05-03 20:10 [PATCH] KVM: x86: Fix shadow paging use-after-free due to unexpected GFN Paolo Bonzini
@ 2026-05-05 10:13 ` Paolo Bonzini
From: Paolo Bonzini @ 2026-05-05 10:13 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Sean Christopherson, Alexander Bulekov, Fred Griffoul, stable

I have started sending out backports for stable kernels up to 6.1.  For 
5.10 and 5.15 I have identified the required patches and backported 
them, but I haven't tested them yet.

I'll get to testing and sending them out, but it will take a while; if 
anybody wants to help testing, I can provide my tentative patches.

This is the list for 5.15:

     27a59d57f073 KVM: x86/mmu: Use a bool for direct
     86938ab6925b KVM: x86/mmu: Stop passing "direct" to mmu_alloc_root()
     2e65e842c57d KVM: x86/mmu: Derive shadow MMU page role from parent
     7f49777550e5 KVM: x86/mmu: Always pass 0 for @quadrant when gptes
                  are 8 bytes
     0cd8dc739833 KVM: x86/mmu: pull call to drop_large_spte() into
                  __link_shadow_page()
     0cb2af2ea66a KVM: x86: Fix shadow paging use-after-free due to
                  unexpected GFN


and the longer one for 5.10:

     b37233c911cb KVM: x86/mmu: Capture 'mmu' in a local variable when
                  allocating roots
     ba0a194ffbfb KVM: x86/mmu: Allocate the lm_root before allocating
                  PAE roots
     748e52b9b736 KVM: x86/mmu: Allocate pae_root and lm_root pages in
                  dedicated helper
     6e6ec5848574 KVM: x86/mmu: Ensure MMU pages are available when
                  allocating roots
     27a59d57f073 KVM: x86/mmu: Use a bool for direct
     86938ab6925b KVM: x86/mmu: Stop passing "direct" to mmu_alloc_root()
     03fffc5493c8 KVM: x86/mmu: Refactor shadow walk in __direct_map() to
                  reduce indentation
     f81602958c11 KVM: X86: Fix missed remote tlb flush in
                  rmap_write_protect()
     65855ed8b034 KVM: X86: Synchronize the shadow pagetable before link
                  it
     2e65e842c57d KVM: x86/mmu: Derive shadow MMU page role from parent
     7f49777550e5 KVM: x86/mmu: Always pass 0 for @quadrant when gptes
                  are 8 bytes
     6e0918aec49a KVM: x86/mmu: Check PDPTRs before allocating PAE roots
     0cd8dc739833 KVM: x86/mmu: pull call to drop_large_spte() into
                  __link_shadow_page()
     0cb2af2ea66a KVM: x86: Fix shadow paging use-after-free due to
                  unexpected GFN

Paolo



