From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D43F53AFB1F for ; Thu, 28 May 2026 09:52:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779961973; cv=none; b=YRcKiv1ARqGLHOOYL6nzkd+2EICZll7DmGtJiHKwVrF+UJoA/o+JOmAea99nT23H2RCrsQCnfZ4tOLdJLlAGRS5eT9KSQshJtYHY2Edy+dtqLexki8q3A9e//sCh34bPpQjHUgD/bYbm5ET21yMYfYByLfXzR2Y9jLl/zPmr9/8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779961973; c=relaxed/simple; bh=R3pGkrNPTeHg8E9HeX+JUH5Kx59Y1n+O7Hz2fTZw7wU=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=CB4LLPHU1jb4cKyVIN2To+MRbgG6YkATTtcqchFJUDk81nb/JfQWScP55uNCggRcS6I3njSXND458GLHMdwgW1e5igIBBeBFAwJfWj66tov33rgdmxQSWmi4REHuredAvI/IveP6ZAN0f9OuAeS5gwuWKm3HyxeNozhuCD5XjrA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YYoNtaJm; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YYoNtaJm" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 669351F000E9; Thu, 28 May 2026 09:52:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1779961971; bh=x0yqGLlXjgZAIHTHHhlDpIEBMiXZ/x6k6zteUR4bLcQ=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=YYoNtaJmOOBPc6MbpgBZk6ncDw3U/yTOcQarMNbgAe9MN+k2629KzUvQv3UPYD19r /Og6ED3hAvLzQuV0YA5cO4zqr/ExWMD/mAO7vca/L55uVaY3NhCFW4wS1MZD5jCKB7 hldjiOHKqd9swLianR9gwWC+BpbKt3ZsJ/aMgyAdoTl5/MdKTeoRulfVDLvolEgiPH KDPs06esssRtcaCys1lfQzE7117O7g996sfNo6v1MEEhj0J5e3soCFi4u71CSNESqT TdA8VsihyA1CTDAPVr+qdW2cyMjHuTL7tSahV1BQwjFIJBEI47VjkO+7mPmMfmryIz hjSeSnGeOoKHQ== From: sashiko-bot@kernel.org Subject: Re: [PATCH v3 09/15] KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs Reply-To: sashiko-reviews@lists.linux.dev To: "Yan Zhao" Cc: kvm@vger.kernel.org In-Reply-To: <20260528081225.10348-1-yan.y.zhao@intel.com> References: <20260528081225.10348-1-yan.y.zhao@intel.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Thu, 28 May 2026 09:52:51 +0000 Message-Id: <20260528095251.669351F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 2 potential issue(= s) to consider: - [Critical] Mutually exclusive `else if` logic in `__handle_changed_spte()= ` prevents external page table updates for parent SPTEs, and incorrect orde= ring creates a use-after-free hazard on error rollback. - [High] Propagating deterministic errors from `set_external_spte()` causes= an infinite retry loop in callers of `tdp_mmu_set_spte_atomic()`. -- commit e8401f58d2841529854a98f404633950aa40f76c Author: Sean Christopherson KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to exte= rnal PTEs This commit moves the propagation of to-present changes and atomic zap changes to external PTEs from __tdp_mmu_set_spte_atomic() to __handle_changed_spte(), centralizing SPTE change handling. Link: https://lore.kernel.org/lkml/aYYn0nf2cayYu8e7@google.com [1] Link: https://lore.kernel.org/kvm/aY_Dq3riRLUNFZfr@google.com [2] > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c > --- a/arch/x86/kvm/mmu/tdp_mmu.c > +++ b/arch/x86/kvm/mmu/tdp_mmu.c [ ... ] > @@ -578,29 +576,51 @@ static void handle_changed_spte(struct kvm *kvm, st= ruct kvm_mmu_page *sp, [ ... ] > /* > * Recursively handle child PTs if the change removed a subtree from > * the paging structure. Note the WARN on the PFN changing without the > * SPTE being converted to a hugepage (leaf) or being zapped. Shadow > * pages are kernel allocations and should never be migrated. > + * > + * For the mirror page table, propagate changes to present or changes of > + * leaf SPTEs to !present under shared mmu_lock to the external SPTE via > + * set_external_spte() op. > */ > if (was_present && !was_leaf && > - (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) > + (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) { > handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared); > + } else if (is_mirror_sp(sp) && (is_present || shared)) { > + int r; > + > + r =3D kvm_x86_call(set_external_spte)(kvm, gfn, old_spte, new_spte, le= vel); > + if (r) > + return r; > + } [Severity: Critical] Does this mutually exclusive else if logic skip external page table updates for parent SPTEs?=20 When replacing a non-leaf mirror SPTE with a leaf SPTE (such as during a hugepage collapse) or performing a shared atomic zap on a non-leaf SPTE, the first condition is met and handle_removed_pt() is called. Because of the else if, the set_external_spte() call is skipped entirely, leaving the hardware external page table desynchronized. Additionally, if this is changed to an independent if block, could the order of operations create a use-after-free on error rollback? If handle_removed_pt() executes first, it irreversibly unlinks and schedules child page tables for RCU freeing. If set_external_spte() then runs and fails (e.g. TDX rejects it), __handle_changed_spte() returns an error. The caller tdp_mmu_set_spte_atomic() will attempt to roll back by restoring the parent SPTE to its old value. However, since the child page tables were already freed, the restored SPTE would point to RCU-freed memory. Should set_external_spte() be called before handle_removed_pt() to allow a safe rollback? > @@ -673,14 +668,51 @@ static inline int __must_check tdp_mmu_set_spte_ato= mic(struct kvm *kvm, [ ... ] > + ret =3D __handle_changed_spte(kvm, sp, iter->gfn, iter->old_spte, > + new_spte, iter->level, true); > + /* > + * Unfreeze the mirror SPTE. If updating the external SPTE failed, > + * restore the old value so that the mirror SPTE isn't frozen in > + * perpetuity, otherwise set the mirror SPTE to the new desired value. > + */ > + if (is_mirror_sptep(iter->sptep)) { > + if (ret) > + __kvm_tdp_mmu_write_spte(iter->sptep, iter->old_spte); > + else > + __kvm_tdp_mmu_write_spte(iter->sptep, new_spte); > + } else { > + /* > + * Bug the VM if handling the change failed, as failure is only > + * allowed if KVM couldn't update the external SPTE. > + */ > + KVM_BUG_ON(ret, kvm); > + } > + return ret; > } [Severity: High] Will propagating deterministic error codes from set_external_spte() cause infinite retry loops in callers of tdp_mmu_set_spte_atomic()? Callers of tdp_mmu_set_spte_atomic() generally assume that any non-zero return value is a transient cmpxchg race (which typically returns -EBUSY), and they implement an unconditional retry pattern on failure. If set_external_spte() rejects an operation and returns a deterministic error like -EINVAL (e.g. rejecting an atomic zap as mentioned in the commit message) or -ENOMEM, this error is propagated back. Because callers do not check the specific error code before retrying, won't this result in a kernel livelock where the caller infinitely loops attempting the same doomed operation? --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260528080856.1014= 1-1-yan.y.zhao@intel.com?part=3D9