public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Huang, Kai" <kai.huang@intel.com>
To: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kas@kernel.org" <kas@kernel.org>,
	"seanjc@google.com" <seanjc@google.com>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>
Cc: "Hansen, Dave" <dave.hansen@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 02/17] KVM: x86/mmu: Update iter->old_spte if cmpxchg64 on mirror SPTE "fails"
Date: Tue, 31 Mar 2026 09:47:29 +0000	[thread overview]
Message-ID: <49cdf35c32e064ef5d6ca24bd4bb9d8b26bc2202.camel@intel.com> (raw)
In-Reply-To: <20260327201421.2824383-3-rick.p.edgecombe@intel.com>

On Fri, 2026-03-27 at 13:14 -0700, Rick Edgecombe wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Pass a pointer to iter->old_spte, not simply its value, when setting an
> external SPTE in __tdp_mmu_set_spte_atomic(), so that the iterator's value
> will be updated if the cmpxchg64 to freeze the mirror SPTE fails.  The bug
> is currently benign as TDX is mutualy exclusive with all paths that do
> "local" retry", e.g. clear_dirty_gfn_range() and wrprot_gfn_range().
> 
> Fixes: 77ac7079e66d ("KVM: x86/tdp_mmu: Propagate building mirror page tables")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 7b1102d26f9c..dbaeb80f2b64 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -509,10 +509,10 @@ static void *get_external_spt(gfn_t gfn, u64 new_spte, int level)
>  }
>  
>  static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sptep,
> -						 gfn_t gfn, u64 old_spte,
> +						 gfn_t gfn, u64 *old_spte,
>  						 u64 new_spte, int level)
>  {
> -	bool was_present = is_shadow_present_pte(old_spte);
> +	bool was_present = is_shadow_present_pte(*old_spte);
>  	bool is_present = is_shadow_present_pte(new_spte);
>  	bool is_leaf = is_present && is_last_spte(new_spte, level);
>  	int ret = 0;
> @@ -525,7 +525,7 @@ static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sp
>  	 * page table has been modified. Use FROZEN_SPTE similar to
>  	 * the zapping case.
>  	 */
> -	if (!try_cmpxchg64(rcu_dereference(sptep), &old_spte, FROZEN_SPTE))
> +	if (!try_cmpxchg64(rcu_dereference(sptep), old_spte, FROZEN_SPTE))
>  		return -EBUSY;
>  
>  	/*
> @@ -541,7 +541,7 @@ static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sp
>  		ret = kvm_x86_call(link_external_spt)(kvm, gfn, level, external_spt);
>  	}
>  	if (ret)
> -		__kvm_tdp_mmu_write_spte(sptep, old_spte);
> +		__kvm_tdp_mmu_write_spte(sptep, *old_spte);
>  	else
>  		__kvm_tdp_mmu_write_spte(sptep, new_spte);
>  	return ret;
> @@ -670,7 +670,7 @@ static inline int __must_check __tdp_mmu_set_spte_atomic(struct kvm *kvm,
>  			return -EBUSY;
>  
>  		ret = set_external_spte_present(kvm, iter->sptep, iter->gfn,
> -						iter->old_spte, new_spte, iter->level);
> +						&iter->old_spte, new_spte, iter->level);
>  		if (ret)
>  			return ret;
>  	} else {

The __tdp_mmu_set_spte_atomic() has a WARN() at the beginning to check the
iter->old_spte isn't a frozen SPTE:

        WARN_ON_ONCE(iter->yielded || is_frozen_spte(iter->old_spte));

Thinking more, I _think_ this patch could potentially trigger this WARNING
due to now set_external_spte_present() will set iter->old_spte to
FROZEN_SPTE when try_cmpxchg64() fails.

Consider there are 3 vCPUs trying to accept the same GFN, and they all reach
__tdp_mmu_set_spte_atomic() simultaneously.  Assuming vCPU1 does the 

	if (!try_cmpxchg64(rcu_dereference(sptep), old_spte, FROZEN_SPTE))
 		return -EBUSY;

.. successfully in set_external_spte_present(), then vCPU2 will fail on the
try_cmpxchg64(), but this will cause iter->old_spte to be updated to
FROZEN_SPTE.

Then when vCPU3 enters __tdp_mmu_set_spte_atomic(), AFAICT the WARNING will
be triggered due to is_frozen_spte(iter->old_spte) will now return true.

Or did I miss anything?

Also, AFAICT this issue doesn't exist for non-TDX case because there's no
case tdp_mmu_set_spte_atomic() is called to set new_spte as FROZEN_SPTE in
such case.

  reply	other threads:[~2026-03-31  9:47 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-27 20:14 [PATCH 00/17] TDX MMU refactors Rick Edgecombe
2026-03-27 20:14 ` [PATCH 01/17] x86/tdx: Use pg_level in TDX APIs, not the TDX-Module's 0-based level Rick Edgecombe
2026-03-27 20:14 ` [PATCH 02/17] KVM: x86/mmu: Update iter->old_spte if cmpxchg64 on mirror SPTE "fails" Rick Edgecombe
2026-03-31  9:47   ` Huang, Kai [this message]
2026-03-31  9:17     ` Yan Zhao
2026-03-31  9:59       ` Huang, Kai
2026-03-31  9:22         ` Yan Zhao
2026-03-31 10:14           ` Huang, Kai
2026-03-27 20:14 ` [PATCH 03/17] KVM: TDX: Account all non-transient page allocations for per-TD structures Rick Edgecombe
2026-03-27 20:14 ` [PATCH 04/17] KVM: x86: Make "external SPTE" ops that can fail RET0 static calls Rick Edgecombe
2026-03-27 20:14 ` [PATCH 05/17] KVM: x86/tdp_mmu: Drop zapping KVM_BUG_ON() set_external_spte_present() Rick Edgecombe
2026-03-27 20:14 ` [PATCH 06/17] KVM: x86/tdp_mmu: Morph the !is_frozen_spte() check into a KVM_MMU_WARN_ON() Rick Edgecombe
2026-03-30  5:00   ` Yan Zhao
2026-03-31 16:37     ` Edgecombe, Rick P
2026-04-02  1:06       ` Yan Zhao
2026-04-02 19:21         ` Sean Christopherson
2026-04-03  2:47           ` Yan Zhao
2026-03-27 20:14 ` [PATCH 07/17] KVM: x86/tdp_mmu: Centralize updates to present external PTEs Rick Edgecombe
2026-03-30  6:14   ` Yan Zhao
2026-04-01 23:45     ` Edgecombe, Rick P
2026-04-02  1:59       ` Yan Zhao
2026-04-02 23:10         ` Edgecombe, Rick P
2026-04-02 23:28           ` Sean Christopherson
2026-04-03  9:05             ` Yan Zhao
2026-04-04  0:15               ` Edgecombe, Rick P
2026-04-07  8:34                 ` Yan Zhao
2026-04-07 17:21                   ` Edgecombe, Rick P
2026-04-08  1:23                     ` Yan Zhao
2026-04-03  9:08           ` Yan Zhao
2026-03-31 10:09   ` Huang, Kai
2026-04-01 23:58     ` Edgecombe, Rick P
2026-04-02 23:21       ` Sean Christopherson
2026-04-01  8:34   ` Yan Zhao
2026-04-02 23:46     ` Edgecombe, Rick P
2026-04-03 10:33       ` Yan Zhao
2026-04-08  1:50         ` Yan Zhao
2026-04-08 10:47   ` Binbin Wu
2026-03-27 20:14 ` [PATCH 08/17] KVM: TDX: Drop kvm_x86_ops.link_external_spt(), use .set_external_spte() for all Rick Edgecombe
2026-03-30  6:28   ` Yan Zhao
2026-03-27 20:14 ` [PATCH 09/17] KVM: TDX: Add helper to handle mapping leaf SPTE into S-EPT Rick Edgecombe
2026-03-30  6:43   ` Yan Zhao
2026-04-01 23:59     ` Edgecombe, Rick P
2026-03-27 20:14 ` [PATCH 10/17] KVM: TDX: Move set_external_spte_present() assert into TDX code Rick Edgecombe
2026-03-31 10:30   ` Huang, Kai
2026-04-02  0:00     ` Edgecombe, Rick P
2026-03-31 10:34   ` Huang, Kai
2026-03-27 20:14 ` [PATCH 11/17] KVM: x86/mmu: Fold set_external_spte_present() into its sole caller Rick Edgecombe
2026-03-31 10:36   ` Huang, Kai
2026-04-01  7:41   ` Yan Zhao
2026-03-27 20:14 ` [PATCH 12/17] KVM: x86/mmu: Plumb the old_spte into kvm_x86_ops.set_external_spte() Rick Edgecombe
2026-03-27 20:14 ` [PATCH 13/17] KVM: TDX: Hoist tdx_sept_remove_private_spte() above set_private_spte() Rick Edgecombe
2026-03-31 10:42   ` Huang, Kai
2026-04-02  0:04     ` Edgecombe, Rick P
2026-03-27 20:14 ` [PATCH 14/17] KVM: x86/mmu: Remove KVM_BUG_ON() that checks lock when removing PTs Rick Edgecombe
2026-03-30  7:01   ` Yan Zhao
2026-03-31 10:46     ` Huang, Kai
2026-04-02  0:08       ` Edgecombe, Rick P
2026-04-02  2:04         ` Yan Zhao
2026-03-27 20:14 ` [PATCH 15/17] KVM: TDX: Handle removal of leaf SPTEs in .set_private_spte() Rick Edgecombe
2026-03-27 20:14 ` [PATCH 16/17] KVM: x86: Move error handling inside free_external_spt() Rick Edgecombe
2026-04-09  2:08   ` Binbin Wu
2026-03-27 20:14 ` [PATCH 17/17] KVM: TDX: Move external page table freeing to TDX code Rick Edgecombe
2026-03-30  7:49   ` Yan Zhao
2026-04-02  0:17     ` Edgecombe, Rick P
2026-04-02  2:16       ` Yan Zhao
2026-04-02  2:17         ` Yan Zhao
2026-03-31 11:02   ` Huang, Kai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49cdf35c32e064ef5d6ca24bd4bb9d8b26bc2202.camel@intel.com \
    --to=kai.huang@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=x86@kernel.org \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox