From: Yan Zhao <yan.y.zhao@intel.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "seanjc@google.com" <seanjc@google.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"Hansen, Dave" <dave.hansen@intel.com>,
"kas@kernel.org" <kas@kernel.org>,
"Huang, Kai" <kai.huang@intel.com>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 07/17] KVM: x86/tdp_mmu: Centralize updates to present external PTEs
Date: Tue, 7 Apr 2026 16:34:16 +0800
Message-ID: <adTBiKPanE15vtjS@yzhao56-desk.sh.intel.com>
In-Reply-To: <d4319635e971e38534509b7424c6c08fa729209c.camel@intel.com>
On Sat, Apr 04, 2026 at 08:15:16AM +0800, Edgecombe, Rick P wrote:
> On Fri, 2026-04-03 at 17:05 +0800, Yan Zhao wrote:
> > Hmm, sorry for the confusion. I didn't express it clearly.
> >
> > The ordering inside tdp_mmu_set_spte_atomic() for mirror root is:
> >
> > Before this patch,
> > 1. set mirror SPTE to frozen
> > 2. invoke TDX op to update external PTE
> > 3. set mirror SPTE to new_spte or restore old_spte
> > 4. if 2 succeeds, invoke handle_changed_spte() to propagate changes to
> > child mirror SPTEs and child external PTEs
> >
> > After this patch,
> > 1. set mirror SPTE to frozen
> > 2. invoke __handle_changed_spte(), which propagates changes to
> > (1) child mirror SPTEs and child external PTEs
> > (2) external PTE
> > 3. set mirror SPTE to new_spte or restore old_spte
> >
> > So, the step to propagate changes to child mirror SPTEs and child external PTEs
> > now occurs before the steps to update the external PTE and the mirror SPTE.
>
> How about I add this info to the log. I think you are right, it's an important
> change to call out.
Ok.
>
> Now it's making me think... Can you (if you haven't already) scrutinize for
> races/reasons that may trigger the KVM_BUG_ON() in handle_changed_spte() due to
> BUSY or other? Like in the handle_removed_pt() path. I guess the write lock
> saves us?
>
> Hmm... zero step?
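First, to make the ordering change quoted above concrete, here is a small
userspace C model of the two sequences. It is only a sketch, not kernel code:
MODEL_FROZEN_SPTE, the event names, and both functions are made up for
illustration, and the TDX op and child propagation are reduced to recording an
event in a trace.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the frozen marker; not the kernel's value. */
#define MODEL_FROZEN_SPTE UINT64_MAX

enum event { EV_FREEZE, EV_EXTERNAL, EV_CHILDREN, EV_PUBLISH, EV_RESTORE };

struct trace {
	enum event ev[8];
	int n;
};

static void record(struct trace *t, enum event e)
{
	assert(t->n < 8);
	t->ev[t->n++] = e;
}

/* Step 1 in both orderings: atomically swap old_spte for the frozen marker. */
static bool freeze(_Atomic uint64_t *sptep, uint64_t old_spte, struct trace *t)
{
	uint64_t expected = old_spte;

	if (!atomic_compare_exchange_strong(sptep, &expected, MODEL_FROZEN_SPTE))
		return false;	/* lost the race, nothing was changed */
	record(t, EV_FREEZE);
	return true;
}

/* Final SPTE write in both orderings: publish new_spte or restore old_spte. */
static void unfreeze(_Atomic uint64_t *sptep, uint64_t old_spte,
		     uint64_t new_spte, bool ok, struct trace *t)
{
	atomic_store(sptep, ok ? new_spte : old_spte);
	record(t, ok ? EV_PUBLISH : EV_RESTORE);
}

/* Ordering before the patch: children are handled last, only on success. */
static bool set_spte_before_patch(_Atomic uint64_t *sptep, uint64_t old_spte,
				  uint64_t new_spte, bool external_ok,
				  struct trace *t)
{
	if (!freeze(sptep, old_spte, t))
		return false;
	record(t, EV_EXTERNAL);					/* step 2 */
	unfreeze(sptep, old_spte, new_spte, external_ok, t);	/* step 3 */
	if (external_ok)
		record(t, EV_CHILDREN);				/* step 4 */
	return external_ok;
}

/* Ordering after the patch: children are handled before the external PTE. */
static bool set_spte_after_patch(_Atomic uint64_t *sptep, uint64_t old_spte,
				 uint64_t new_spte, bool external_ok,
				 struct trace *t)
{
	if (!freeze(sptep, old_spte, t))
		return false;
	record(t, EV_CHILDREN);					/* step 2 (1) */
	record(t, EV_EXTERNAL);					/* step 2 (2) */
	unfreeze(sptep, old_spte, new_spte, external_ok, t);	/* step 3 */
	return external_ok;
}
```

Calling both functions with the same arguments and comparing the traces shows
the only difference: EV_CHILDREN moves from the end of the sequence to right
after EV_FREEZE, so it now happens even when the external update later fails.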
Below is the list of all handle_changed_spte()-related scenarios.

Legend:
NP: !shadow-present SPTE
P: shadow-present SPTE
X: Yes or No
shared: whether mmu_lock is held shared
valid: whether the scenario is valid in the TDP MMU
mirror root allowed: whether the scenario could occur in the mirror root
                     (after basic TDX huge page support)
KVM_BUG_ON() hittable: whether the KVM_BUG_ON() in handle_changed_spte() is
                       hittable

Scenarios 1-5 are for mapping.
Scenarios 6-10 are for shadow-present to shadow-present transitions.
Scenarios 11-14 are for zapping.

                                            |mirror root| KVM_BUG_ON()
 # | old_spte  | new_spte  | shared | valid | allowed   | hittable
---|-----------|-----------|--------|-------|-----------|-----------------------
 1 | NP        | leaf P    | Y      | Y     | Y         | N
 2 | NP        | leaf P    | N      | Y     | N         | Y (a1)
 3 | NP        | nonleaf P | Y      | Y     | Y         | N
 4 | NP        | nonleaf P | N      | Y     | N         | Y (a2)
 5 | NP        | NP        | X      | N     | N         | warn !mmio_spte && !frozen_spte
---|-----------|-----------|--------|-------|-----------|-----------------------
 6 | leaf P    | leaf P    | X      | N     | X         | N
 7 | leaf P    | nonleaf P | Y      | Y     | N         | N
 8 | leaf P    | nonleaf P | N      | Y     | Y         | N
 9 | nonleaf P | leaf P    | X      | Y     | N         | Y (b)
10 | nonleaf P | nonleaf P | X      | N     | X         | warn pfn_changed
---|-----------|-----------|--------|-------|-----------|-----------------------
11 | leaf P    | NP        | Y      | Y     | N         | N (c)
12 | leaf P    | NP        | N      | Y     | Y         | N
13 | nonleaf P | NP        | Y      | Y     | N         | Y (d)
14 | nonleaf P | NP        | N      | Y     | Y         | N

Currently, only four scenarios, (a1), (a2), (b), and (d), may trigger the
KVM_BUG_ON() in handle_changed_spte(), and none of them is currently reachable
in the mirror root.

(a1)(a2) May hit the KVM_BUG_ON() in handle_changed_spte() if
         tdx_sept_set_private_spte() fails due to contention, e.g.,
         tdh_mem_sept_add(), tdx_mem_page_aug(), or tdx_mem_page_add() may
         contend with tdh_vp_enter() due to zero-step mitigation, or may
         potentially contend with TDCALLs.

(b) Promotion case. Currently unreachable in the mirror root.
    Supporting it in the future would need more complex changes in the TDP MMU.

(c) Will not hit the KVM_BUG_ON() in the TDP MMU, but will trigger warnings in
    tdx_sept_remove_private_spte(), due to lockdep_assert_held_write() or to
    TDX_BUG_ON() caused by concurrent BLOCK, TRACK, REMOVE.

(d) May hit the KVM_BUG_ON() in handle_changed_spte() due to failure to remove
    child S-EPT entries, and will trigger warnings in
    tdx_sept_remove_private_spte(), due to lockdep_assert_held_write() or to
    TDX_BUG_ON() caused by concurrent BLOCK, TRACK, REMOVE.
    May also trigger TDX_BUG_ON() in tdx_sept_reclaim_private_spt().
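
As a cross-check of the table, here is a minimal standalone C sketch (not
kernel code) that encodes the 14 rows and counts how many KVM_BUG_ON()-hittable
scenarios could occur in the mirror root. The enum, struct, and function names
are made up for illustration. It assumes that "X" in the "mirror root allowed"
column is treated conservatively as "may be allowed", and that the warn-only
rows 5 and 10 do not count as KVM_BUG_ON() hits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Table cell values: N, Y, or X ("Yes or No"). Illustrative names only. */
enum tri { TRI_N, TRI_Y, TRI_X };

struct scenario {
	int id;
	const char *old_spte;
	const char *new_spte;
	enum tri shared;
	enum tri valid;
	enum tri mirror_allowed;
	bool bug_on_hittable;	/* true only for the "Y (..)" rows */
};

/* Rows 5 and 10 only warn, so they are not counted as KVM_BUG_ON() hits. */
static const struct scenario scenarios[] = {
	{  1, "NP",        "leaf P",    TRI_Y, TRI_Y, TRI_Y, false },
	{  2, "NP",        "leaf P",    TRI_N, TRI_Y, TRI_N, true  },	/* a1 */
	{  3, "NP",        "nonleaf P", TRI_Y, TRI_Y, TRI_Y, false },
	{  4, "NP",        "nonleaf P", TRI_N, TRI_Y, TRI_N, true  },	/* a2 */
	{  5, "NP",        "NP",        TRI_X, TRI_N, TRI_N, false },
	{  6, "leaf P",    "leaf P",    TRI_X, TRI_N, TRI_X, false },
	{  7, "leaf P",    "nonleaf P", TRI_Y, TRI_Y, TRI_N, false },
	{  8, "leaf P",    "nonleaf P", TRI_N, TRI_Y, TRI_Y, false },
	{  9, "nonleaf P", "leaf P",    TRI_X, TRI_Y, TRI_N, true  },	/* b */
	{ 10, "nonleaf P", "nonleaf P", TRI_X, TRI_N, TRI_X, false },
	{ 11, "leaf P",    "NP",        TRI_Y, TRI_Y, TRI_N, false },	/* c */
	{ 12, "leaf P",    "NP",        TRI_N, TRI_Y, TRI_Y, false },
	{ 13, "nonleaf P", "NP",        TRI_Y, TRI_Y, TRI_N, true  },	/* d */
	{ 14, "nonleaf P", "NP",        TRI_N, TRI_Y, TRI_Y, false },
};

#define NR_SCENARIOS (sizeof(scenarios) / sizeof(scenarios[0]))

/* Number of rows marked "Y (..)" in the "KVM_BUG_ON() hittable" column. */
static size_t count_bug_on_hittable(void)
{
	size_t i, n = 0;

	for (i = 0; i < NR_SCENARIOS; i++) {
		assert(scenarios[i].id == (int)i + 1);	/* rows are in order */
		if (scenarios[i].bug_on_hittable)
			n++;
	}
	return n;
}

/* Hittable rows that are not ruled out for the mirror root (Y or X). */
static size_t count_bug_on_hittable_in_mirror_root(void)
{
	size_t i, n = 0;

	for (i = 0; i < NR_SCENARIOS; i++) {
		if (scenarios[i].bug_on_hittable &&
		    scenarios[i].mirror_allowed != TRI_N)
			n++;
	}
	return n;
}
```

With the table as given, count_bug_on_hittable() returns 4 and
count_bug_on_hittable_in_mirror_root() returns 0, which matches the summary
above. If a future change flips "mirror root allowed" for rows 2, 4, 9, or 13,
the second count becomes non-zero.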
> > > Ya, I'm a bit confused too. For me, the "tricky" part is understanding the need
> > > to set the mirror SPTE to FROZEN_SPTE while updating the external SPTE. Once that
> > > is understood, I don't find passing in @new_spte to be surprising in any way.
> > I still find it tricky because it seems strange to me to invoke a function named
> > handle_changed_spte() before the change actually occurs on the SPTE (i.e., to me,
> > the SPTE has only changed from xxx to FROZEN_SPTE, but handle_changed_spte()
> > handles changes from xxx to new_spte).
> >
> > Besides, another tricky point (currently benign to TDX) is that:
> > before this patch, tdp_mmu_set_spte_atomic() cannot be used to atomically zap
> > non-leaf mirror SPTEs, since TDX requires child PTEs to be zapped before the
> > parent PTE;
> > after this patch, performing atomic zapping of non-leaf mirror SPTEs seems to be
> > allowed in TDP MMU since the above step 2.1 now occurs before step 2.2. However,
> > if step 2.2 fails after step 2.1 succeeds, step 3 cannot easily restore the real
> > old state.
> > So, if we allow atomic zap on the mirror root in the future, it looks like we
> > need to ensure atomic zapping of S-EPT cannot fail.
>
> I think we shouldn't comment these kinds of TDX specifics. It won't confuse
> non-TDX developers working in the TDP MMU, I think.
Ok.
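
For reference, the restore problem with atomic zapping of a non-leaf mirror
SPTE described in the quoted text above can be modeled in a few lines of
userspace C. This is only a sketch under simplifying assumptions: the struct
and function names are hypothetical, the external page table is reduced to
"present" flags, and step 2.1 (removing the children) is assumed to always
succeed while step 2.2 (removing the parent) may fail.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CHILDREN 4

/* Hypothetical model: one non-leaf parent entry with NR_CHILDREN children. */
struct model {
	bool mirror_parent_present;
	bool external_parent_present;
	bool external_child_present[NR_CHILDREN];
};

static void model_init(struct model *m)
{
	int i;

	m->mirror_parent_present = true;
	m->external_parent_present = true;
	for (i = 0; i < NR_CHILDREN; i++)
		m->external_child_present[i] = true;
}

static int external_children_present(const struct model *m)
{
	int i, n = 0;

	for (i = 0; i < NR_CHILDREN; i++)
		n += m->external_child_present[i];
	return n;
}

/* True if the model looks exactly like it did right after model_init(). */
static bool matches_old_state(const struct model *m)
{
	return m->mirror_parent_present && m->external_parent_present &&
	       external_children_present(m) == NR_CHILDREN;
}

/* Atomic zap of the parent, following the post-patch step ordering. */
static bool atomic_zap_nonleaf(struct model *m, bool parent_removal_ok)
{
	int i;

	assert(m->mirror_parent_present);

	/* Step 1: freeze the mirror SPTE (modeled as temporarily gone). */
	m->mirror_parent_present = false;

	/* Step 2.1: remove the children from the external page table. */
	for (i = 0; i < NR_CHILDREN; i++)
		m->external_child_present[i] = false;

	/* Step 2.2: remove the parent external PTE, which may fail. */
	if (!parent_removal_ok) {
		/* Step 3: restore old_spte; the children stay removed. */
		m->mirror_parent_present = true;
		return false;
	}
	m->external_parent_present = false;

	/* Step 3: the mirror SPTE stays non-present on success. */
	return true;
}
```

After model_init(), a failed atomic_zap_nonleaf(&m, false) leaves the parent
present in both tables but with no external children, so matches_old_state()
is false even though old_spte was written back. That is the state step 3
cannot easily undo.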