From: sashiko-bot@kernel.org
To: "Yan Zhao" <yan.y.zhao@intel.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v3 09/15] KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs
Date: Thu, 28 May 2026 09:52:51 +0000 [thread overview]
Message-ID: <20260528095251.669351F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260528081225.10348-1-yan.y.zhao@intel.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Mutually exclusive `else if` logic in `__handle_changed_spte()` prevents external page table updates for parent SPTEs, and incorrect ordering creates a use-after-free hazard on error rollback.
- [High] Propagating deterministic errors from `set_external_spte()` causes an infinite retry loop in callers of `tdp_mmu_set_spte_atomic()`.
--
commit e8401f58d2841529854a98f404633950aa40f76c
Author: Sean Christopherson <seanjc@google.com>
KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs
This commit moves the propagation of to-present changes and atomic zap
changes to external PTEs from __tdp_mmu_set_spte_atomic() to
__handle_changed_spte(), centralizing SPTE change handling.
Link: https://lore.kernel.org/lkml/aYYn0nf2cayYu8e7@google.com [1]
Link: https://lore.kernel.org/kvm/aY_Dq3riRLUNFZfr@google.com [2]
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
[ ... ]
> @@ -578,29 +576,51 @@ static void handle_changed_spte(struct kvm *kvm, struct kvm_mmu_page *sp,
[ ... ]
> /*
> * Recursively handle child PTs if the change removed a subtree from
> * the paging structure. Note the WARN on the PFN changing without the
> * SPTE being converted to a hugepage (leaf) or being zapped. Shadow
> * pages are kernel allocations and should never be migrated.
> + *
> + * For the mirror page table, propagate changes to present or changes of
> + * leaf SPTEs to !present under shared mmu_lock to the external SPTE via
> + * set_external_spte() op.
> */
> if (was_present && !was_leaf &&
> - (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
> + (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) {
> handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared);
> + } else if (is_mirror_sp(sp) && (is_present || shared)) {
> + int r;
> +
> + r = kvm_x86_call(set_external_spte)(kvm, gfn, old_spte, new_spte, level);
> + if (r)
> + return r;
> + }
[Severity: Critical]
Does this mutually exclusive else if logic skip external page table updates
for parent SPTEs?
When replacing a non-leaf mirror SPTE with a leaf SPTE (such as during a
hugepage collapse) or performing a shared atomic zap on a non-leaf SPTE, the
first condition is met and handle_removed_pt() is called. Because of the
else if, the set_external_spte() call is skipped entirely, leaving the
hardware external page table desynchronized.
Additionally, if this is changed to an independent if block, could the order
of operations create a use-after-free on error rollback?
If handle_removed_pt() executes first, it irreversibly unlinks and schedules
child page tables for RCU freeing. If set_external_spte() then runs and
fails (e.g. TDX rejects it), __handle_changed_spte() returns an error. The
caller tdp_mmu_set_spte_atomic() will attempt to roll back by restoring the
parent SPTE to its old value. However, since the child page tables were
already freed, the restored SPTE would point to RCU-freed memory.
Should set_external_spte() be called before handle_removed_pt() to allow a
safe rollback?
> @@ -673,14 +668,51 @@ static inline int __must_check tdp_mmu_set_spte_atomic(struct kvm *kvm,
[ ... ]
> + ret = __handle_changed_spte(kvm, sp, iter->gfn, iter->old_spte,
> + new_spte, iter->level, true);
> + /*
> + * Unfreeze the mirror SPTE. If updating the external SPTE failed,
> + * restore the old value so that the mirror SPTE isn't frozen in
> + * perpetuity, otherwise set the mirror SPTE to the new desired value.
> + */
> + if (is_mirror_sptep(iter->sptep)) {
> + if (ret)
> + __kvm_tdp_mmu_write_spte(iter->sptep, iter->old_spte);
> + else
> + __kvm_tdp_mmu_write_spte(iter->sptep, new_spte);
> + } else {
> + /*
> + * Bug the VM if handling the change failed, as failure is only
> + * allowed if KVM couldn't update the external SPTE.
> + */
> + KVM_BUG_ON(ret, kvm);
> + }
> + return ret;
> }
[Severity: High]
Will propagating deterministic error codes from set_external_spte() cause
infinite retry loops in callers of tdp_mmu_set_spte_atomic()?
Callers of tdp_mmu_set_spte_atomic() generally assume that any non-zero
return value is a transient cmpxchg race (which typically returns -EBUSY),
and they implement an unconditional retry pattern on failure.
If set_external_spte() rejects an operation and returns a deterministic
error like -EINVAL (e.g. rejecting an atomic zap as mentioned in the commit
message) or -ENOMEM, this error is propagated back. Because callers do not
check the specific error code before retrying, won't this result in a
kernel livelock where the caller infinitely loops attempting the same doomed
operation?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260528080856.10141-1-yan.y.zhao@intel.com?part=9
next prev parent reply other threads:[~2026-05-28 9:52 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-28 8:08 [PATCH v3 00/15] TDX MMU refactors Yan Zhao
2026-05-28 8:10 ` [PATCH v3 01/15] KVM: TDX: Drop kvm_x86_ops.link_external_spt() Yan Zhao
2026-05-28 8:11 ` [PATCH v3 02/15] KVM: TDX: Wrap mapping of leaf and non-leaf S-EPT entries into helpers Yan Zhao
2026-05-28 8:11 ` [PATCH v3 03/15] KVM: x86/mmu: Fold set_external_spte_present() into its sole caller Yan Zhao
2026-05-28 8:11 ` [PATCH v3 04/15] KVM: x86/mmu: Plumb param "old_spte" into kvm_x86_ops.set_external_spte() Yan Zhao
2026-05-28 8:11 ` [PATCH v3 05/15] KVM: TDX: Move KVM_BUG_ON()s in __tdp_mmu_set_spte_atomic() to TDX code Yan Zhao
2026-05-28 9:45 ` sashiko-bot
2026-05-28 8:11 ` [PATCH v3 06/15] KVM: TDX: Move lockdep assert " Yan Zhao
2026-05-28 8:12 ` [PATCH v3 07/15] KVM: x86/tdp_mmu: Morph !is_frozen_spte() check into a KVM_MMU_WARN_ON() Yan Zhao
2026-05-28 8:12 ` [PATCH v3 08/15] KVM: x86/mmu: Plumb "sp" _pointer_ into the TDP MMU's handle_changed_spte() Yan Zhao
2026-05-28 8:12 ` [PATCH v3 09/15] KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs Yan Zhao
2026-05-28 9:52 ` sashiko-bot [this message]
2026-05-28 8:12 ` [PATCH v3 10/15] KVM: x86/mmu: Drop KVM_BUG_ON() on shared lock to zap child " Yan Zhao
2026-05-28 8:12 ` [PATCH v3 11/15] KVM: TDX: Hoist tdx_sept_remove_private_spte() above set_private_spte() Yan Zhao
2026-05-28 8:12 ` [PATCH v3 12/15] KVM: TDX: Drop kvm_x86_ops.remove_external_spte() Yan Zhao
2026-05-28 8:13 ` [PATCH v3 13/15] KVM: TDX: Rename tdx_sept_remove_private_spte() to show it's for leaf SPTEs Yan Zhao
2026-05-28 8:13 ` [PATCH v3 14/15] KVM: x86: Move error handling inside free_external_spt() Yan Zhao
2026-05-28 8:13 ` [PATCH v3 15/15] KVM: TDX: Move external page table freeing to TDX code Yan Zhao
2026-05-28 13:03 ` [PATCH v3 00/15] TDX MMU refactors Sean Christopherson
2026-05-29 5:34 ` Yan Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260528095251.669351F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox