From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: seanjc@google.com, pbonzini@redhat.com, yan.y.zhao@intel.com,
	kai.huang@intel.com, kvm@vger.kernel.org, kas@kernel.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	dave.hansen@intel.com, rick.p.edgecombe@intel.com
Subject: [PATCH 17/17] KVM: TDX: Move external page table freeing to TDX code
Date: Fri, 27 Mar 2026 13:14:21 -0700
Message-ID: <20260327201421.2824383-18-rick.p.edgecombe@intel.com>
In-Reply-To: <20260327201421.2824383-1-rick.p.edgecombe@intel.com>
From: Sean Christopherson <seanjc@google.com>
Move the freeing of external page tables into the reclaim operation that
lives in TDX code.

The TDP MMU supports traversing the TDP page tables without holding
locks, so page tables need to be freed via RCU to prevent a lockless
walker from touching one that has already been freed.

While none of these lockless walk operations actually happen for the
mirror EPT, the TDP MMU nonetheless frees the mirror EPT page tables in
the same way and, because it's a handy place to plug it in, the external
page tables as well.

However, the external page tables definitely can't be walked once they are
reclaimed from the TDX module. The TDX module releases the page for the
host VMM to use, so this RCU-time free is unnecessary for external page
tables.

So move the free_page() call to TDX code. Create a
tdp_mmu_free_unused_sp() to allow for freeing external page tables that
have never left the TDP MMU code (i.e. don't need to be freed in a
special way).

Link: https://lore.kernel.org/kvm/aYpjNrtGmogNzqwT@google.com/
Not-yet-Signed-off-by: Sean Christopherson <seanjc@google.com>
[Based on a diff by Sean, added log]
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
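Not part of the patch: a minimal user-space sketch of the ownership split
described in the log, assuming a toy model rather than the kernel's real
data structures. Every toy_* name, the 4096-byte page size, and the singly
linked list standing in for a pending call_rcu() queue are hypothetical and
exist only for illustration. It shows three paths: freeing a never-linked
page table pair directly, freeing the external page immediately once its
owner has reclaimed it, and a deferred callback that frees only the mirror
page and complains if an external page is still attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm_mmu_page; not the kernel layout. */
struct toy_sp {
	void *spt;		/* mirror page table */
	void *external_spt;	/* page handed to the external owner */
	struct toy_sp *next_deferred;
};

static int pages_freed;			/* demo-only counter of page frees */
static struct toy_sp *deferred_head;	/* toy stand-in for call_rcu() */

static void *toy_alloc_page(void)
{
	void *p = calloc(1, 4096);

	assert(p);
	return p;
}

static void toy_free_page(void *p)
{
	if (p) {
		free(p);
		pages_freed++;
	}
}

static struct toy_sp *toy_alloc_sp(void)
{
	struct toy_sp *sp = calloc(1, sizeof(*sp));

	assert(sp);
	sp->spt = toy_alloc_page();
	sp->external_spt = toy_alloc_page();
	return sp;
}

/* Common tail: frees only what the MMU side owns. */
static void toy_free_sp_common(struct toy_sp *sp)
{
	toy_free_page(sp->spt);
	free(sp);
}

/* The sp was never linked, so nothing else can see either page. */
static void toy_free_unused_sp(struct toy_sp *sp)
{
	toy_free_page(sp->external_spt);
	toy_free_sp_common(sp);
}

/*
 * External owner's reclaim path. reclaim_ok models whether the reclaim
 * succeeded; on failure the page is deliberately leaked, as on the
 * "goto out" path in the diff.
 */
static void toy_reclaim_external(struct toy_sp *sp, bool reclaim_ok)
{
	if (reclaim_ok)
		toy_free_page(sp->external_spt);
	sp->external_spt = NULL;
}

static void toy_defer_free(struct toy_sp *sp)
{
	sp->next_deferred = deferred_head;
	deferred_head = sp;
}

/* Runs the deferred frees; returns how many saw a stale external page. */
static int toy_run_deferred(void)
{
	int warnings = 0;

	while (deferred_head) {
		struct toy_sp *sp = deferred_head;

		deferred_head = sp->next_deferred;
		if (sp->external_spt) {
			fprintf(stderr, "external page still attached\n");
			warnings++;
		}
		toy_free_sp_common(sp);
	}
	return warnings;
}

static int toy_demo(void)
{
	struct toy_sp *unused = toy_alloc_sp();
	struct toy_sp *linked = toy_alloc_sp();

	toy_free_unused_sp(unused);		/* both pages freed directly */
	toy_reclaim_external(linked, true);	/* external page freed now */
	toy_defer_free(linked);
	return toy_run_deferred();		/* mirror page freed later */
}
```

Calling toy_demo() from a main() exercises all three paths. In this toy
model the deferred callback never touches the external page, which is the
property the WARN_ON_ONCE() in the diff below is meant to catch if violated.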
arch/x86/kvm/mmu/tdp_mmu.c | 16 +++++++++++-----
arch/x86/kvm/vmx/tdx.c | 11 ++++++++++-
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 575033cc7fe4..18e11c1c7631 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -53,13 +53,18 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 	rcu_barrier();
 }
 
-static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
+static void __tdp_mmu_free_sp(struct kvm_mmu_page *sp)
 {
-	free_page((unsigned long)sp->external_spt);
 	free_page((unsigned long)sp->spt);
 	kmem_cache_free(mmu_page_header_cache, sp);
 }
 
+static void tdp_mmu_free_unused_sp(struct kvm_mmu_page *sp)
+{
+	free_page((unsigned long)sp->external_spt);
+	__tdp_mmu_free_sp(sp);
+}
+
 /*
  * This is called through call_rcu in order to free TDP page table memory
  * safely with respect to other kernel threads that may be operating on
@@ -73,7 +78,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
 	struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
 					       rcu_head);
 
-	tdp_mmu_free_sp(sp);
+	WARN_ON_ONCE(sp->external_spt);
+	__tdp_mmu_free_sp(sp);
 }
 
 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
@@ -1261,7 +1267,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 * failed, e.g. because a different task modified the SPTE.
 		 */
 		if (r) {
-			tdp_mmu_free_sp(sp);
+			tdp_mmu_free_unused_sp(sp);
 			goto retry;
 		}
 
@@ -1571,7 +1577,7 @@ static int tdp_mmu_split_huge_pages_root(struct kvm *kvm,
 	 * installs its own sp in place of the last sp we tried to split.
 	 */
 	if (sp)
-		tdp_mmu_free_sp(sp);
+		tdp_mmu_free_unused_sp(sp);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d064b40a6b31..1346e891ca94 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1782,7 +1782,16 @@ static void tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 */
 	if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
 	    tdx_reclaim_page(virt_to_page(sp->external_spt)))
-		sp->external_spt = NULL;
+		goto out;
+
+	/*
+	 * Immediately free the S-EPT page as the TDX subsystem doesn't support
+	 * freeing pages from RCU callbacks, and more importantly because
+	 * TDH.PHYMEM.PAGE.RECLAIM ensures there are no outstanding readers.
+	 */
+	free_page((unsigned long)sp->external_spt);
+out:
+	sp->external_spt = NULL;
 }
 
 static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
--
2.53.0