From: Binbin Wu <binbin.wu@linux.intel.com>
To: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: seanjc@google.com, pbonzini@redhat.com, yan.y.zhao@intel.com,
kai.huang@intel.com, kvm@vger.kernel.org, kas@kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
dave.hansen@intel.com
Subject: Re: [PATCH 16/17] KVM: x86: Move error handling inside free_external_spt()
Date: Thu, 9 Apr 2026 10:08:15 +0800
Message-ID: <5da1feaa-a376-4586-9593-12eff82e0b3d@linux.intel.com>
In-Reply-To: <20260327201421.2824383-17-rick.p.edgecombe@intel.com>
On 3/28/2026 4:14 AM, Rick Edgecombe wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> Move the logic for TDX’s specific need to leak pages when reclaim
> fails inside the free_external_spt() op, so this can be done in TDX
> specific code and not the generic MMU.
>
> Do this by passing the SP in instead of the external page table
> pointer. This way TDX code can set sp->external_spt to NULL. Since the
> error is now handled internally, change the op to return void. This way
> it also operated like a normal free in that success is guaranteed from
Typo: operated -> operates?
> the callers perspective.
>
> Opportunistically, drop the unused level arg while adjusting the sp arg.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> [re-wrote log and massaged op name]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> Notable changes since last discussion
> - Since free_external_sp() is dropped in the latter DPAMT patches, don't
> bother renaming free_external_spt().
> ---
> arch/x86/include/asm/kvm-x86-ops.h | 2 +-
> arch/x86/include/asm/kvm_host.h | 3 +--
> arch/x86/kvm/mmu/tdp_mmu.c | 13 ++-----------
> arch/x86/kvm/vmx/tdx.c | 25 +++++++++++--------------
> 4 files changed, 15 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index ed348c6dd445..10ccf6ea9d9a 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -96,7 +96,7 @@ KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
> KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
> KVM_X86_OP(load_mmu_pgd)
> KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
> -KVM_X86_OP_OPTIONAL_RET0(free_external_spt)
> +KVM_X86_OP_OPTIONAL(free_external_spt)
> KVM_X86_OP(has_wbinvd_exit)
> KVM_X86_OP(get_l2_tsc_offset)
> KVM_X86_OP(get_l2_tsc_multiplier)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09588e797e4b..fbc39f0bb491 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1881,8 +1881,7 @@ struct kvm_x86_ops {
> u64 new_spte, enum pg_level level);
>
> /* Update external page tables for page table about to be freed. */
> - int (*free_external_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
> - void *external_spt);
> + void (*free_external_spt)(struct kvm *kvm, gfn_t gfn, struct kvm_mmu_page *sp);
>
>
> bool (*has_wbinvd_exit)(void);
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 806788bdecce..575033cc7fe4 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -455,17 +455,8 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
> handle_changed_spte(kvm, sp, gfn, old_spte, FROZEN_SPTE, level, shared);
> }
>
> - if (is_mirror_sp(sp) &&
> - WARN_ON(kvm_x86_call(free_external_spt)(kvm, base_gfn, sp->role.level,
> - sp->external_spt))) {
Nit:
One thing that might be worth mentioning in the cover letter: before this
change, if tdx_reclaim_page() returns an error code, the WARN_ON() here is
triggered. After the change, the warning is covered by the TDX_BUG_ON_3(),
which is deeper in the stack. Saying so would make it clearer that a
tdx_reclaim_page() failure is still not handled silently.
> - /*
> - * Failed to free page table page in mirror page table and
> - * there is nothing to do further.
> - * Intentionally leak the page to prevent the kernel from
> - * accessing the encrypted page.
> - */
> - sp->external_spt = NULL;
> - }
> + if (is_mirror_sp(sp))
> + kvm_x86_call(free_external_spt)(kvm, base_gfn, sp);
>
> call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
> }
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index bfbadba8bc08..d064b40a6b31 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1765,27 +1765,24 @@ static void tdx_track(struct kvm *kvm)
> kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
> }
>
> -static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
> - enum pg_level level, void *private_spt)
> +static void tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
gfn is not used in this function right now either.
Also, since sp is now passed in, the gfn can be obtained from sp->gfn.
Should the gfn parameter be dropped as well?
> + struct kvm_mmu_page *sp)
> {
> - struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> -
> /*
> - * free_external_spt() is only called after hkid is freed when TD is
> - * tearing down.
> * KVM doesn't (yet) zap page table pages in mirror page table while
> * TD is active, though guest pages mapped in mirror page table could be
> * zapped during TD is active, e.g. for shared <-> private conversion
> * and slot move/deletion.
> + *
> + * In other words, KVM should only free mirror page tables after the
> + * TD's hkid is freed, when the TD is being torn down.
> + *
> + * If the S-EPT PTE can't be removed for any reason, intentionally leak
> + * the page to prevent the kernel from accessing the encrypted page.
> */
> - if (KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm))
> - return -EIO;
> -
> - /*
> - * The HKID assigned to this TD was already freed and cache was
> - * already flushed. We don't have to flush again.
> - */
> - return tdx_reclaim_page(virt_to_page(private_spt));
> + if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
> + tdx_reclaim_page(virt_to_page(sp->external_spt)))
> + sp->external_spt = NULL;
> }
>
> static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
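To make the gfn question above concrete, here is a standalone userspace sketch of what the op could look like with only sp passed in. It is not kernel code and is untested against the series: struct mock_sp, mock_reclaim_page(), mock_free_external_spt() and mock_free_sp() are all hypothetical stand-ins, and the later free of sp->external_spt is only modeled on my reading of the generic path.

```c
/* Userspace model only: none of these types or helpers are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint64_t gfn_t;

/* Hypothetical stand-in for struct kvm_mmu_page. */
struct mock_sp {
	gfn_t gfn;
	void *external_spt;
};

/*
 * Hypothetical stand-in for tdx_reclaim_page(): the warning is emitted
 * here, deeper in the stack, and a non-zero value is returned on failure.
 */
static int mock_reclaim_page(void *page, bool fail)
{
	(void)page;
	if (fail) {
		fprintf(stderr, "WARN: reclaim failed\n");
		return -1;
	}
	return 0;
}

/*
 * The op without a gfn parameter. If an implementation ever needs the
 * gfn, sp->gfn is available. On any failure the page is intentionally
 * leaked by clearing the pointer, so the op can return void.
 */
static void mock_free_external_spt(struct mock_sp *sp, bool hkid_assigned,
				   bool reclaim_fails)
{
	assert(sp);
	if (hkid_assigned ||
	    mock_reclaim_page(sp->external_spt, reclaim_fails))
		sp->external_spt = NULL;
}

/*
 * Models the later free of the page table page. Returns true if a page
 * was actually handed back, false if it had been leaked.
 */
static bool mock_free_sp(struct mock_sp *sp)
{
	bool freed = sp->external_spt != NULL;

	free(sp->external_spt);
	sp->external_spt = NULL;
	return freed;
}
```

With this shape the caller has no error to check, and the only observable effect of a failure is that the later free sees a NULL external_spt and skips the page.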