public inbox for kvm@vger.kernel.org
From: Binbin Wu <binbin.wu@linux.intel.com>
To: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: seanjc@google.com, pbonzini@redhat.com, yan.y.zhao@intel.com,
	kai.huang@intel.com, kvm@vger.kernel.org, kas@kernel.org,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	dave.hansen@intel.com
Subject: Re: [PATCH 16/17] KVM: x86: Move error handling inside free_external_spt()
Date: Thu, 9 Apr 2026 10:08:15 +0800
Message-ID: <5da1feaa-a376-4586-9593-12eff82e0b3d@linux.intel.com>
In-Reply-To: <20260327201421.2824383-17-rick.p.edgecombe@intel.com>



On 3/28/2026 4:14 AM, Rick Edgecombe wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Move the logic for TDX’s specific need to leak pages when reclaim
> fails inside the free_external_spt() op, so this can be done in TDX
> specific code and not the generic MMU.
> 
> Do this by passing the SP in instead of the external page table
> pointer. This way TDX code can set sp->external_spt to NULL. Since the
> error is now handled internally, change the op to return void. This way
> it also operated like a normal free in that success is guaranteed from

Typo: operated -> operates?

> the callers perspective.
> 
> Opportunistically, drop the unused level arg while adjusting the sp arg.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> [re-wrote log and massaged op name]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> Notable changes since last discussion
>  - Since free_external_sp() is dropped in the latter DPAMT patches, don't
>    bother renaming free_external_spt().
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  2 +-
>  arch/x86/include/asm/kvm_host.h    |  3 +--
>  arch/x86/kvm/mmu/tdp_mmu.c         | 13 ++-----------
>  arch/x86/kvm/vmx/tdx.c             | 25 +++++++++++--------------
>  4 files changed, 15 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index ed348c6dd445..10ccf6ea9d9a 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -96,7 +96,7 @@ KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
>  KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
>  KVM_X86_OP(load_mmu_pgd)
>  KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
> -KVM_X86_OP_OPTIONAL_RET0(free_external_spt)
> +KVM_X86_OP_OPTIONAL(free_external_spt)
>  KVM_X86_OP(has_wbinvd_exit)
>  KVM_X86_OP(get_l2_tsc_offset)
>  KVM_X86_OP(get_l2_tsc_multiplier)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09588e797e4b..fbc39f0bb491 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1881,8 +1881,7 @@ struct kvm_x86_ops {
>  				 u64 new_spte, enum pg_level level);
>  
>  	/* Update external page tables for page table about to be freed. */
> -	int (*free_external_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
> -				 void *external_spt);
> +	void (*free_external_spt)(struct kvm *kvm, gfn_t gfn, struct kvm_mmu_page *sp);
>  
>  
>  	bool (*has_wbinvd_exit)(void);
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 806788bdecce..575033cc7fe4 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -455,17 +455,8 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>  		handle_changed_spte(kvm, sp, gfn, old_spte, FROZEN_SPTE, level, shared);
>  	}
>  
> -	if (is_mirror_sp(sp) &&
> -	    WARN_ON(kvm_x86_call(free_external_spt)(kvm, base_gfn, sp->role.level,
> -						    sp->external_spt))) {

Nit:
It might be worth mentioning in the cover letter that, before this change,
an error code returned by tdx_reclaim_page() triggered the WARN_ON() here.
After this change, the warning is covered by the TDX_BUG_ON_3(), which is
deeper in the stack. Mentioning this would make it clearer that a
tdx_reclaim_page() failure is still not handled silently.
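
To illustrate the point above, here is a minimal userspace sketch (not kernel
code) of the pattern the patch moves to: the free op returns void, the failure
is reported where it happens, and the page is leaked by clearing the pointer.
Every mock_* name is hypothetical and stands in for the real kernel symbol
named in its comment. The fprintf() only models where the warning would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kvm_mmu_page; only the field used here. */
struct mock_sp {
	void *external_spt;
};

/*
 * Hypothetical stand-in for tdx_reclaim_page(): 0 on success, nonzero on
 * failure. The "fail" flag is an assumption added to drive both paths.
 */
static int mock_reclaim_page(void *page, bool fail)
{
	(void)page;
	if (fail) {
		/* Models the warning firing at the point of failure. */
		fprintf(stderr, "mock: reclaim failed\n");
		return -1;
	}
	return 0;
}

/* New-style op: returns void and leaks the page on failure. */
static void mock_free_external_spt(struct mock_sp *sp, bool fail)
{
	assert(sp != NULL);
	if (mock_reclaim_page(sp->external_spt, fail))
		sp->external_spt = NULL;
}

/* Models the generic caller: no return value to check, no WARN_ON(). */
bool mock_page_was_leaked(bool fail)
{
	static char page[16];
	struct mock_sp sp = { .external_spt = page };

	mock_free_external_spt(&sp, fail);
	return sp.external_spt == NULL;
}
```

In this sketch the caller never sees an error. The only observable effect of
a failure is that external_spt has been cleared, so a later free of the sp
would skip the leaked page.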

> -		/*
> -		 * Failed to free page table page in mirror page table and
> -		 * there is nothing to do further.
> -		 * Intentionally leak the page to prevent the kernel from
> -		 * accessing the encrypted page.
> -		 */
> -		sp->external_spt = NULL;
> -	}
> +	if (is_mirror_sp(sp))
> +		kvm_x86_call(free_external_spt)(kvm, base_gfn, sp);
>  
>  	call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index bfbadba8bc08..d064b40a6b31 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1765,27 +1765,24 @@ static void tdx_track(struct kvm *kvm)
>  	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
>  }
>  
> -static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
> -				     enum pg_level level, void *private_spt)
> +static void tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,

gfn is also unused in this function right now.
Also, since sp is passed in now, the gfn can be obtained from sp->gfn.
Should the gfn argument be dropped as well?
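
For reference, a rough userspace sketch of what a two-argument op could look
like, assuming the callee reads the gfn from the sp when it needs one. This is
only an illustration of the suggestion, not a proposed patch. All mock_* types
and functions are hypothetical stand-ins for the kernel ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_gfn_t;

/* Hypothetical stand-ins for struct kvm and struct kvm_mmu_page. */
struct mock_kvm {
	int unused;
};

struct mock_mmu_page {
	mock_gfn_t gfn;
	void *external_spt;
};

/* Hypothetical ops table with the gfn argument dropped. */
struct mock_ops {
	void (*free_external_spt)(struct mock_kvm *kvm,
				  struct mock_mmu_page *sp);
};

/* Records the gfn the callee saw, so the sketch can be checked. */
static mock_gfn_t last_freed_gfn;

static void mock_tdx_free(struct mock_kvm *kvm, struct mock_mmu_page *sp)
{
	(void)kvm;
	assert(sp != NULL);
	last_freed_gfn = sp->gfn;	/* gfn recovered from the sp */
	sp->external_spt = NULL;
}

/* Calls the op through the table and returns the gfn the callee used. */
mock_gfn_t mock_call_free(mock_gfn_t gfn)
{
	struct mock_kvm kvm = { 0 };
	struct mock_mmu_page sp = { .gfn = gfn, .external_spt = NULL };
	const struct mock_ops ops = { .free_external_spt = mock_tdx_free };

	ops.free_external_spt(&kvm, &sp);
	return last_freed_gfn;
}
```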


> +					struct kvm_mmu_page *sp)
>  {
> -	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> -
>  	/*
> -	 * free_external_spt() is only called after hkid is freed when TD is
> -	 * tearing down.
>  	 * KVM doesn't (yet) zap page table pages in mirror page table while
>  	 * TD is active, though guest pages mapped in mirror page table could be
>  	 * zapped during TD is active, e.g. for shared <-> private conversion
>  	 * and slot move/deletion.
> +	 *
> +	 * In other words, KVM should only free mirror page tables after the
> +	 * TD's hkid is freed, when the TD is being torn down.
> +	 *
> +	 * If the S-EPT PTE can't be removed for any reason, intentionally leak
> +	 * the page to prevent the kernel from accessing the encrypted page.
>  	 */
> -	if (KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm))
> -		return -EIO;
> -
> -	/*
> -	 * The HKID assigned to this TD was already freed and cache was
> -	 * already flushed. We don't have to flush again.
> -	 */
> -	return tdx_reclaim_page(virt_to_page(private_spt));
> +	if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
> +	    tdx_reclaim_page(virt_to_page(sp->external_spt)))
> +		sp->external_spt = NULL;
>  }
>  
>  static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,

