From: Tejun Heo
To: David Vernet, Andrea Righi, Changwoo Min, Alexei Starovoitov,
    Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Kumar Kartikeya Dwivedi
Cc: Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
    David Hildenbrand, Mike Rapoport, Emil Tsalapatis,
    sched-ext@lists.linux.dev, bpf@vger.kernel.org, x86@kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Tejun Heo
Subject: [PATCH 1/8] mm: Add ptep_try_set() for lockless empty-slot installs
Date: Wed, 20 May 2026 13:50:45 -1000
Message-ID: <20260520235052.4180316-2-tj@kernel.org>
In-Reply-To: <20260520235052.4180316-1-tj@kernel.org>
References: <20260520235052.4180316-1-tj@kernel.org>

Add ptep_try_set(ptep, new_pte): atomically set *ptep to new_pte iff it
is currently pte_none(). Returns true on success, false if the slot was
already populated or the arch has no implementation.

The intended caller is the upcoming bpf_arena kernel-side fault recovery
path.
The install runs from a page fault that can be nested under locks held
by the faulting kernel caller (e.g. a BPF program holding
raw_res_spin_lock_irqsave on its arena's spinlock), so trylock-and-retry
would A-A deadlock. Lock-free cmpxchg is the only viable option, which
constrains this helper to special kernel page tables where concurrent
writers cooperate via atomic accessors.

The generic version in <linux/pgtable.h> returns false. x86 and arm64
override with try_cmpxchg-based implementations on the underlying
pteval. Other architectures get the false stub - the callers there
already fall through to oops.

v2: Rename to ptep_try_set(). Tighten kerneldoc for kernel-PTE use.
    (David, Alexei)

Suggested-by: Kumar Kartikeya Dwivedi
Suggested-by: Alexei Starovoitov
Signed-off-by: Tejun Heo
Cc: David Hildenbrand
---
 arch/arm64/include/asm/pgtable.h |  8 ++++++++
 arch/x86/include/asm/pgtable.h   |  8 ++++++++
 include/linux/pgtable.h          | 26 ++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 9029b81ccbe8..a129be91ef2c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1830,6 +1830,14 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 	return __ptep_get_and_clear(mm, addr, ptep);
 }
 
+static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
+{
+	pteval_t old = 0;
+
+	return try_cmpxchg(&pte_val(*ptep), &old, pte_val(new_pte));
+}
+#define ptep_try_set ptep_try_set
+
 #define test_and_clear_young_ptes test_and_clear_young_ptes
 static inline bool test_and_clear_young_ptes(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep, unsigned int nr)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 13e3e9a054cb..047e273a4eab 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1284,6 +1284,14 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm,
 	} while (!try_cmpxchg((long *)&ptep->pte, (long *)&old_pte, *(long *)&new_pte));
 }
 
+static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
+{
+	pte_t old_pte = __pte(0);
+
+	return try_cmpxchg((long *)&ptep->pte, (long *)&old_pte, *(long *)&new_pte);
+}
+#define ptep_try_set ptep_try_set
+
 #define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
 
 #define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index cdd68ed3ae1a..d68374f404c1 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1036,6 +1036,32 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef ptep_try_set
+/**
+ * ptep_try_set - atomically set an empty kernel PTE
+ * @ptep: page table entry
+ * @new_pte: value to install
+ *
+ * Atomically set *@ptep to @new_pte iff *@ptep is pte_none(). Return
+ * true on success, false if the slot was already populated or the
+ * arch has no implementation.
+ *
+ * For special kernel page tables only - never user page tables. The
+ * caller must prevent concurrent teardown of @ptep and must accept
+ * that other writers may race. Concurrent clearers must use
+ * ptep_get_and_clear() so racing accesses agree on the outcome.
+ *
+ * Architectures opt in by providing a cmpxchg-based override and
+ * defining ptep_try_set as an identity macro. The generic stub
+ * returns false, which is correct for callers that fall through to
+ * oops on failure.
+ */
+static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
+{
+	return false;
+}
+#endif
+
 #ifndef wrprotect_ptes
 /**
  * wrprotect_ptes - Write-protect PTEs that map consecutive pages of the same
-- 
2.54.0
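
For readers who want to experiment with the install-if-empty contract
outside the kernel, here is a rough userspace analogue. It is only a
sketch under stated assumptions, not part of the patch: slot_t,
slot_try_set(), slot_get_and_clear() and slot_demo() are hypothetical
names, a plain 64-bit word with 0 meaning "empty" stands in for a
pte_none() entry, and C11 atomic_compare_exchange_strong() stands in for
the kernel's try_cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PTE slot; 0 plays the role of pte_none(). */
typedef _Atomic uint64_t slot_t;

/*
 * Install new_val only if the slot is still empty. A single
 * compare-and-swap either wins or reports that another writer got
 * there first; it never takes a lock and never retries.
 */
static bool slot_try_set(slot_t *slot, uint64_t new_val)
{
	uint64_t expected = 0;

	return atomic_compare_exchange_strong(slot, &expected, new_val);
}

/* Atomically empty the slot and return the value that was installed. */
static uint64_t slot_get_and_clear(slot_t *slot)
{
	return atomic_exchange(slot, 0);
}

/* Walk through the contract once; returns true if every step held. */
static bool slot_demo(void)
{
	slot_t slot = 0;

	assert(slot_try_set(&slot, 0x1000));		/* empty: install wins */
	assert(!slot_try_set(&slot, 0x2000));		/* populated: install loses */
	assert(atomic_load(&slot) == 0x1000);		/* loser did not overwrite */
	assert(slot_get_and_clear(&slot) == 0x1000);	/* clearer sees the winner */
	assert(slot_try_set(&slot, 0x2000));		/* empty again: install wins */
	return true;
}
```

Calling slot_demo() from a test harness exercises the win, lose, clear
and reinstall sequence. The clearer uses an atomic exchange rather than
a plain store for the same reason the kerneldoc asks concurrent clearers
to use ptep_get_and_clear(): both sides of a race then agree on which
value was in the slot.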