Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: Yan Zhao <yan.y.zhao@intel.com>
To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org,
	rick.p.edgecombe@intel.com, kas@kernel.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	dave.hansen@intel.com, kai.huang@intel.com,
	binbin.wu@linux.intel.com, xiaoyao.li@intel.com,
	yan.y.zhao@intel.com
Subject: [PATCH v2 15/15] KVM: TDX: Move external page table freeing to TDX code
Date: Sat,  9 May 2026 15:57:40 +0800	[thread overview]
Message-ID: <20260509075740.4371-1-yan.y.zhao@intel.com> (raw)
In-Reply-To: <20260509075201.4077-1-yan.y.zhao@intel.com>

From: Sean Christopherson <seanjc@google.com>

Move the freeing of external page tables into the reclaim operation that
lives in TDX code.

The TDP MMU supports traversing the TDP without holding locks. Page tables
need to be freed via RCU to prevent walking one that gets freed.

While none of these lockless walk operations actually happen for the mirror
page table, the TDP MMU nonetheless frees the mirror page table in the same
way, and (because it's a handy place to plug it in) the external page table
as well.

However, the external page table definitely can't be walked once the page
table pages are reclaimed from the TDX module. The TDX module releases the
page for the host VMM to use, so this RCU-time free is unnecessary for the
external page table.

So move the free_page() call to TDX code. Create an
tdp_mmu_free_unused_sp() to allow for freeing external page tables that
have never left the TDP MMU code (i.e. don't need freed in a special way).

Link: https://lore.kernel.org/kvm/aYpjNrtGmogNzqwT@google.com/
Not-yet-Signed-off-by: Sean Christopherson <seanjc@google.com>
[Based on a diff by Sean, added log]
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
MMU_refactors v2:
- Fixed typos in the patch log. (Yan, Kai)
- Still kept "Not-yet-Signed-off-by" tag. Sean, please change it to SoB if
  the patch looks good to you.
- Updated the code comment in tdx_sept_free_private_spt(): invoking
  free_page() to free S-EPT page in tdx_sept_free_private_spt() is only
  because RCU-time free is unnecessary, not because it can't be performed
  from RCU callbacks. (Yan)
---
 arch/x86/kvm/mmu/tdp_mmu.c | 16 +++++++++++-----
 arch/x86/kvm/vmx/tdx.c     | 11 ++++++++++-
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index a847a8f09bc6..bb18e9e61542 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -53,13 +53,18 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 	rcu_barrier();
 }
 
-static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
+static void __tdp_mmu_free_sp(struct kvm_mmu_page *sp)
 {
-	free_page((unsigned long)sp->external_spt);
 	free_page((unsigned long)sp->spt);
 	kmem_cache_free(mmu_page_header_cache, sp);
 }
 
+static void tdp_mmu_free_unused_sp(struct kvm_mmu_page *sp)
+{
+	free_page((unsigned long)sp->external_spt);
+	__tdp_mmu_free_sp(sp);
+}
+
 /*
  * This is called through call_rcu in order to free TDP page table memory
  * safely with respect to other kernel threads that may be operating on
@@ -73,7 +78,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
 	struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
 					       rcu_head);
 
-	tdp_mmu_free_sp(sp);
+	WARN_ON_ONCE(sp->external_spt);
+	__tdp_mmu_free_sp(sp);
 }
 
 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
@@ -1266,7 +1272,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 * failed, e.g. because a different task modified the SPTE.
 		 */
 		if (r) {
-			tdp_mmu_free_sp(sp);
+			tdp_mmu_free_unused_sp(sp);
 			goto retry;
 		}
 
@@ -1577,7 +1583,7 @@ static int tdp_mmu_split_huge_pages_root(struct kvm *kvm,
 	 * installs its own sp in place of the last sp we tried to split.
 	 */
 	if (sp)
-		tdp_mmu_free_sp(sp);
+		tdp_mmu_free_unused_sp(sp);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 9431bc443d50..2539107e0ad3 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1869,7 +1869,16 @@ static void tdx_sept_free_private_spt(struct kvm *kvm, struct kvm_mmu_page *sp)
 	 */
 	if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
 	    tdx_reclaim_page(virt_to_page(sp->external_spt)))
-		sp->external_spt = NULL;
+		goto out;
+
+	/*
+	 * Immediately free the S-EPT page because RCU-time free is unnecessary
+	 * after TDH.PHYMEM.PAGE.RECLAIM ensures there are no outstanding
+	 * readers.
+	 */
+	free_page((unsigned long)sp->external_spt);
+out:
+	sp->external_spt = NULL;
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.43.2


      parent reply	other threads:[~2026-05-09  8:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-09  7:52 [PATCH v2 00/15] TDX MMU refactors Yan Zhao
2026-05-09  7:53 ` [PATCH v2 01/15] KVM: TDX: Drop kvm_x86_ops.link_external_spt() Yan Zhao
2026-05-09  7:55 ` [PATCH v2 02/15] KVM: TDX: Wrap mapping of leaf and non-leaf S-EPT entries into helpers Yan Zhao
2026-05-09  7:55 ` [PATCH v2 03/15] KVM: x86/mmu: Fold set_external_spte_present() into its sole caller Yan Zhao
2026-05-09  7:55 ` [PATCH v2 04/15] KVM: x86/mmu: Plumb param "old_spte" into kvm_x86_ops.set_external_spte() Yan Zhao
2026-05-09  7:55 ` [PATCH v2 05/15] KVM: TDX: Move KVM_BUG_ON()s in __tdp_mmu_set_spte_atomic() to TDX code Yan Zhao
2026-05-09  7:55 ` [PATCH v2 06/15] KVM: TDX: Move lockdep assert " Yan Zhao
2026-05-09  7:56 ` [PATCH v2 07/15] KVM: x86/tdp_mmu: Morph !is_frozen_spte() check into a KVM_MMU_WARN_ON() Yan Zhao
2026-05-09  7:56 ` [PATCH v2 08/15] KVM: x86/mmu: Plumb "sp" _pointer_ into the TDP MMU's handle_changed_spte() Yan Zhao
2026-05-09  7:56 ` [PATCH v2 09/15] KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs Yan Zhao
2026-05-09  7:56 ` [PATCH v2 10/15] KVM: x86/mmu: Drop KVM_BUG_ON() on shared lock to zap child " Yan Zhao
2026-05-09  7:56 ` [PATCH v2 11/15] KVM: TDX: Hoist tdx_sept_remove_private_spte() above set_private_spte() Yan Zhao
2026-05-09  7:57 ` [PATCH v2 12/15] KVM: TDX: Drop kvm_x86_ops.remove_external_spte() Yan Zhao
2026-05-09  7:57 ` [PATCH v2 13/15] KVM: TDX: Rename tdx_sept_remove_private_spte() to show it's for leaf SPTEs Yan Zhao
2026-05-09  7:57 ` [PATCH v2 14/15] KVM: x86: Move error handling inside free_external_spt() Yan Zhao
2026-05-09  7:57 ` Yan Zhao [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260509075740.4371-1-yan.y.zhao@intel.com \
    --to=yan.y.zhao@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=dave.hansen@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=x86@kernel.org \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox