From: Tejun Heo <tj@kernel.org>
To: Alexei Starovoitov <ast@kernel.org>,
	David Hildenbrand <david@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Andrea Righi <arighi@nvidia.com>,
	Changwoo Min <changwoo@igalia.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	sched-ext@lists.linux.dev, bpf@vger.kernel.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/8] bpf: Recover arena kernel faults with scratch page
Date: Sun, 31 May 2026 07:47:58 -1000
Message-ID: <8cc56c7a4aa29628b7d17d85be7eadb9@kernel.org>
In-Reply-To: <ahnedQ33ZH8ogjbC@slm.duckdns.org>
Hello,
I posted the check removal [1], and Sashiko's review flagged a
break-before-make problem with it [2] that I think is real.
The scratch page is a present PAGE_KERNEL mapping, so having
apply_range_set_cb() overwrite it via set_pte_at() during
bpf_arena_alloc_pages() is a valid->valid PFN change. I'm not familiar with
arm at all. David, my understanding is that this is a break-before-make
violation on arm64, and that on any arch the stale TLB entry keeps resolving to the
shared scratch page until it's flushed, so a later access can hit scratch
instead of the new page. Is that what you were worried about?
So instead of just dropping the check, the install should route through an
invalid entry rather than overwrite in place:

	while (!ptep_try_set(pte, mk_pte(page, PAGE_KERNEL))) {
		old = ptep_get(pte);
		if (pte_none(old))
			continue;
		if (WARN_ON_ONCE(pte_page(old) != arena->scratch_page))
			return -EBUSY;
		ptep_get_and_clear(&init_mm, addr, pte);
		broke_scratch = true;
	}

ptep_try_set() only fills a none slot, so the slot goes scratch->none->page
and never valid->valid, and the loop copes with a concurrent fault
re-scratching it. This also closes the set_pte_at()-vs-ptep_try_set() race
I raised earlier, since both sides are now cmpxchg. A broken scratch entry
was live, so the caller flush_tlb_kernel_range()s those pages when
broke_scratch is set, like arena_free_pages() already does after clearing.
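
To make the scratch->none->page sequence concrete, here is a rough userspace
model of the loop above. It is only a sketch, not kernel code: a "PTE" is a
single C11 atomic word where 0 means none, and every model_* name is
hypothetical, standing in for ptep_try_set(), ptep_get() and
ptep_get_and_clear(). It has no TLB, so the flush_tlb_kernel_range() step is
only represented by the broke_scratch flag the caller would act on.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one atomic word per "PTE", 0 means none. */
typedef _Atomic uint64_t model_pte_t;

#define MODEL_PTE_NONE 0ULL

/* Stand-in for ptep_try_set(): succeeds only if the slot is currently none. */
static bool model_try_set(model_pte_t *pte, uint64_t val)
{
	uint64_t expected = MODEL_PTE_NONE;

	return atomic_compare_exchange_strong(pte, &expected, val);
}

/* Stand-in for ptep_get_and_clear(): atomically read and empty the slot. */
static uint64_t model_get_and_clear(model_pte_t *pte)
{
	return atomic_exchange(pte, MODEL_PTE_NONE);
}

/*
 * Install @page, routing through none if the slot holds @scratch.
 * Sets *broke_scratch when a live scratch entry was torn down, which is
 * the case where the real caller would have to flush the TLB.
 * Returns -EBUSY if the slot holds anything other than none or scratch.
 */
static int model_install(model_pte_t *pte, uint64_t page, uint64_t scratch,
			 bool *broke_scratch)
{
	while (!model_try_set(pte, page)) {
		uint64_t old = atomic_load(pte);

		if (old == MODEL_PTE_NONE)
			continue;	/* raced with a clear, retry the set */
		if (old != scratch)
			return -EBUSY;	/* unexpected foreign mapping */
		model_get_and_clear(pte);
		*broke_scratch = true;
	}
	return 0;
}
```

In this model the slot is only ever written by a compare-and-swap from none or
by a clear to none, so it never goes directly from one nonzero value to
another, which is the property the proposed install relies on.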
[1] https://lore.kernel.org/r/20260531165852.555930-1-tj@kernel.org
[2] https://lore.kernel.org/r/20260531170854.31EA51F00893@smtp.kernel.org
Thanks.
--
tejun
Thread overview: 25+ messages
2026-05-22 17:22 [PATCHSET v4 sched_ext/for-7.2] bpf/arena: Direct kernel-side access Tejun Heo
2026-05-22 17:22 ` [PATCH 1/8] mm: Add ptep_try_set() for lockless empty-slot installs Tejun Heo
2026-05-22 22:07 ` David Hildenbrand (Arm)
2026-05-25 15:50 ` patchwork-bot+netdevbpf
2026-05-22 17:22 ` [PATCH 2/8] bpf: Recover arena kernel faults with scratch page Tejun Heo
2026-05-26 12:45 ` David Hildenbrand (Arm)
2026-05-28 21:30 ` Alexei Starovoitov
2026-05-29 18:12 ` Tejun Heo
2026-05-29 18:38 ` Alexei Starovoitov
2026-05-29 18:44 ` Tejun Heo
2026-05-31 17:47 ` Tejun Heo [this message]
2026-05-31 18:58 ` David Hildenbrand (Arm)
2026-05-22 17:22 ` [PATCH 3/8] bpf: Add sleepable variant of bpf_arena_alloc_pages for kernel callers Tejun Heo
2026-05-22 17:22 ` [PATCH 4/8] bpf: Add bpf_struct_ops_for_each_prog() Tejun Heo
2026-05-22 17:22 ` [PATCH 5/8] bpf/arena: Add bpf_arena_map_kern_vm_start() and bpf_prog_arena() Tejun Heo
2026-05-22 17:22 ` [PATCH 6/8] sched_ext: Require an arena for cid-form schedulers Tejun Heo
2026-05-22 17:22 ` [PATCH 7/8] sched_ext: Sub-allocator over kernel-claimed BPF arena pages Tejun Heo
2026-05-22 17:22 ` [PATCH 8/8] sched_ext: Convert ops.set_cmask() to arena-resident cmask Tejun Heo
2026-05-25 15:45 ` [PATCHSET v4 sched_ext/for-7.2] bpf/arena: Direct kernel-side access Alexei Starovoitov
2026-05-25 19:54 ` Tejun Heo
-- strict thread matches above, loose matches on Subject: below --
2026-05-20 23:50 [PATCHSET v3 " Tejun Heo
2026-05-20 23:50 ` [PATCH 2/8] bpf: Recover arena kernel faults with scratch page Tejun Heo
2026-05-21 3:16 ` Emil Tsalapatis
2026-05-21 9:42 ` Alexei Starovoitov
2026-05-21 17:39 ` Tejun Heo
2026-05-17 21:12 [PATCHSET v2 sched_ext/for-7.2] bpf/arena: Direct kernel-side access Tejun Heo
2026-05-17 21:12 ` [PATCH 2/8] bpf: Recover arena kernel faults with scratch page Tejun Heo