From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "Zhao, Yan Y" <yan.y.zhao@intel.com>
Cc: "Hansen, Dave" <dave.hansen@intel.com>,
"seanjc@google.com" <seanjc@google.com>,
"Huang, Kai" <kai.huang@intel.com>,
"kas@kernel.org" <kas@kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [PATCH 07/17] KVM: x86/tdp_mmu: Centralize updates to present external PTEs
Date: Wed, 1 Apr 2026 23:45:54 +0000
Message-ID: <8a107d4da92d4cf910f9a70991a0e67b42e04d4f.camel@intel.com>
In-Reply-To: <acoUt9vy8OPU1SW9@yzhao56-desk.sh.intel.com>
On Mon, 2026-03-30 at 14:14 +0800, Yan Zhao wrote:
> > u64 new_spte)
> > {
> > + struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(iter->sptep));
> > int ret;
> >
> > lockdep_assert_held_read(&kvm->mmu_lock);
> >
> > - ret = __tdp_mmu_set_spte_atomic(kvm, iter, new_spte);
> > + /* KVM should never freeze SPTEs using higher level APIs. */
> "higher level API" is ambiguous. e.g. kvm_tdp_mmu_write_spte_atomic() allows
> new_spte to be FROZEN_SPTE.
Yea you are right. It felt too fuzzy but I couldn't think of a better word.
>
> What about just "callers of tdp_mmu_set_spte_atomic() should not freeze SPTEs
> directly"?
Sure.
>
> > + KVM_MMU_WARN_ON(is_frozen_spte(new_spte));
> > +
> > + /*
> > + * Temporarily freeze the SPTE until the external PTE operation has
> > + * completed (unless the new SPTE itself will be frozen), e.g. so that
> > + * concurrent faults don't attempt to install a child PTE in the
> > + * external page table before the parent PTE has been written, or try
> > + * to re-install a page table before the old one was removed.
> > + */
> > + if (is_mirror_sptep(iter->sptep))
> > + ret = __tdp_mmu_set_spte_atomic(kvm, iter, FROZEN_SPTE);
> > + else
> > + ret = __tdp_mmu_set_spte_atomic(kvm, iter, new_spte);
> > if (ret)
> > return ret;
> >
> > - handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> > - new_spte, iter->level, true);
> > + ret = __handle_changed_spte(kvm, sp, iter->gfn, iter->old_spte,
> > + new_spte, iter->level, true);
>
> What about adding a comment for the tricky part for the mirror page table:
> while new_spte is set to FROZEN_SPTE in the above __tdp_mmu_set_spte_atomic()
You meant it sets iter->sptep I think.
> for freezing the mirror page table, the original new_spte from the caller of
> tdp_mmu_set_spte_atomic() is passed to __handle_changed_spte() in order to
> properly update statistics and propagate to the external page table.
new_spte was already passed in. What changed? You mean that
__tdp_mmu_set_spte_atomic() sets iter->sptep and doesn't update new_spte? If so,
I'm not sure it meets the threshold for a comment in the TDP MMU.
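
For illustration only, here is a minimal userspace sketch of the ordering
being discussed: the frozen marker is what gets written to the mirror PTE
slot, while the caller's new_spte is what the change handler sees. This is
not code from the patch. Every toy_* name, the TOY_FROZEN_SPTE value, and the
error code are hypothetical stand-ins, and the kernel's RCU, locking, and
statistics handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define TOY_FROZEN_SPTE UINT64_C(0x5a0)
#define TOY_EBUSY 16

struct toy_iter {
	_Atomic uint64_t *sptep;	/* the PTE slot being updated */
	uint64_t old_spte;		/* value the caller last observed */
	bool is_mirror;			/* true if backed by an external table */
};

/* Models the external page table update; fails when told to. */
bool toy_external_fails;
uint64_t toy_external_pte;

static int toy_handle_changed_spte(uint64_t old_spte, uint64_t new_spte)
{
	(void)old_spte;
	if (toy_external_fails)
		return -TOY_EBUSY;
	toy_external_pte = new_spte;	/* propagate the caller's value */
	return 0;
}

int toy_set_spte_atomic(struct toy_iter *iter, uint64_t new_spte)
{
	uint64_t expected = iter->old_spte;
	uint64_t first;
	int ret;

	/* Callers should not freeze SPTEs directly. */
	assert(new_spte != TOY_FROZEN_SPTE);

	/* Mirror PTEs get the frozen marker first; others get new_spte. */
	first = iter->is_mirror ? TOY_FROZEN_SPTE : new_spte;
	if (!atomic_compare_exchange_strong(iter->sptep, &expected, first)) {
		iter->old_spte = expected;	/* lost the race */
		return -TOY_EBUSY;
	}

	/* The handler sees the caller's new_spte, not the frozen marker. */
	ret = toy_handle_changed_spte(iter->old_spte, new_spte);

	/* Unfreeze: roll back on failure, otherwise publish new_spte. */
	if (iter->is_mirror)
		atomic_store(iter->sptep, ret ? iter->old_spte : new_spte);
	return ret;
}
```

In this model a failed external update leaves the slot holding the old value
rather than the frozen marker, which is the rollback behavior the quoted
"Unfreeze the mirror SPTE" comment describes.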
>
> > - return 0;
> > + /*
> > + * Unfreeze the mirror SPTE. If updating the external SPTE failed,
> > + * restore the old SPTE so that the SPTE isn't frozen in perpetuity,
> > + * otherwise set the mirror SPTE to the new desired value.
> > + */
> > + if (is_mirror_sptep(iter->sptep)) {
> > + if (ret)
> > + __kvm_tdp_mmu_write_spte(iter->sptep, iter->old_spte);
> > + else
> > + __kvm_tdp_mmu_write_spte(iter->sptep, new_spte);
> > + } else {
> > + /*
> > + * Bug the VM if handling the change failed, as failure is only
> > + * allowed if KVM couldn't update the external SPTE.
> > + */
> > + KVM_BUG_ON(ret, kvm);
> > + }
> > + return ret;
> > }
> >
> > /*
> > @@ -738,6 +759,8 @@ static inline int __must_check tdp_mmu_set_spte_atomic(struct kvm *kvm,
> > static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
> > u64 old_spte, u64 new_spte, gfn_t gfn, int level)
> > {
> > + struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(sptep));
> > +
> > lockdep_assert_held_write(&kvm->mmu_lock);
> >
> > /*
> > @@ -751,7 +774,7 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
> >
> > old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
> >
> > - handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
> > + handle_changed_spte(kvm, sp, gfn, old_spte, new_spte, level, false);
> >
> > /*
> > * Users that do non-atomic setting of PTEs don't operate on mirror
> > @@ -1373,6 +1396,9 @@ static void kvm_tdp_mmu_age_spte(struct kvm *kvm, struct tdp_iter *iter)
> > {
> > u64 new_spte;
> >
> > + if (WARN_ON_ONCE(is_mirror_sptep(iter->sptep)))
> > + return;
> > +
> Add a comment for why mirror page table is not expected here?
Ehh, maybe. Thinking about what to put... The warning is kind of cheating a
little bit on the idea of the patch: to forward all changes through limited ops
in a central place, such that we don't have TDX specifics encoded in core MMU.
Trying to forward this through properly would result in more burden to the TDP
MMU, so that's not the right answer either.
"Mirror TDP doesn't support PTE aging" is a pretty obvious comment. I'm fine
just leaving it without comment, but I can add something like that. Or do you
have another suggestion?
>
> And do we need a similar WARN_ON_ONCE() in kvm_tdp_mmu_clear_dirty_pt_masked()
> or clear_dirty_pt_masked()?
Nothing changes for those in this patch though? For the kvm_tdp_mmu_age_spte()
case, warning coverage is removed in this patch.
>
> > if (spte_ad_enabled(iter->old_spte)) {
> > iter->old_spte = tdp_mmu_clear_spte_bits_atomic(iter->sptep,
> > shadow_accessed_mask);
> > --
> > 2.53.0