From: Puranjay Mohan <puranjay@kernel.org>
To: bpf@vger.kernel.org
Cc: Puranjay Mohan <puranjay@kernel.org>,
Puranjay Mohan <puranjay12@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>,
kernel-team@meta.com
Subject: [PATCH bpf v5 0/8] Introduce KF_FORBID_FAULT modifier for acquire/release kfuncs
Date: Thu, 26 Feb 2026 08:14:49 -0800
Message-ID: <20260226161500.775715-1-puranjay@kernel.org>
Changelog:
v4: https://lore.kernel.org/all/20260224212535.1165151-1-puranjay@kernel.org/
Changes in v4 -> v5:
- Base the commits on bpf/master and not bpf-next/master
- Rename KF_FORBID_SLEEP to KF_FORBID_FAULT: mmap_lock is a sleeping
  lock (rw_semaphore), so the actual constraint is about faulting (which
  would deadlock on mmap_lock re-acquisition), not sleeping (Alexei)
- Rename forbid_sleep/forbid_sleep_count to forbid_fault/forbid_fault_count
  throughout verifier and headers
- Change verifier error description from "nosleep region" to
  "nofault region"
- Add new preparatory patch (patch 4) to consolidate the open-coded
  sleepable check in check_func_call() into in_sleepable_context() +
  non_sleepable_context_description(), consistent with
  check_helper_call() and check_kfunc_call()
- Update selftest error messages to match: "nofault region" and use
  #{{[0-9]+}} regex for helper IDs
v3: https://lore.kernel.org/all/20260223174659.2749964-1-puranjay@kernel.org/
Changes in v3 -> v4:
- In iter_release_acquired_ref(), drop the if (!err) guard before
  zeroing iter_st->id; the verifier stops on error anyway (Eduard)
- In process_iter_next_call() DRAINED branch, guard iter_release_acquired_ref()
  with is_kfunc_acquire() check for future-proofing (Eduard)
- In check_kfunc_call() KF_RELEASE path, validate that the stack slot is
  actually STACK_ITER before operating on it (Mykyta)
- In check_kfunc_call() KF_RELEASE path, use iter_release_acquired_ref()
  helper instead of open-coding release_reference() + id zeroing (Eduard)
- In check_kfunc_call() KF_ACQUIRE path, move is_iter_next_kfunc() check
  inside the is_kfunc_acquire() block so there is only one place checking
  is_kfunc_acquire() (Mykyta)
- Add new patch to consolidate scattered sleepable checks in
  check_kfunc_call() into a single in_sleepable_context() check (Eduard)
- Drop separate forbid_sleep_count check in check_kfunc_call(), now
  covered by the consolidated in_sleepable_context() check (Eduard)
- Use in_sleepable_context() for global subprog sleep check in
  check_func_call() instead of open-coding (Eduard)
- Add runtime test for nested task_vma iterators on the same task to
  verify mmap_read_trylock() handles concurrent readers (Alexei)
v2: https://lore.kernel.org/all/20260223160300.2109907-1-puranjay@kernel.org/
Changes in v2 -> v3:
- Rebased on bpf-next/master
v1: https://lore.kernel.org/all/20260218182555.1501495-1-puranjay@kernel.org/
Changes in v1 -> v2:
- Add a patch to consolidate sleepable context error message printing in
  check_helper_call(); no functional changes (Eduard)
- In check_kfunc_call() for KF_RELEASE and __iter arg, use release_regno
  like dynptr rather than custom handling (Mykyta)
- Fix some comments to follow correct style (Mykyta)
- Move state->forbid_sleep_count-- to release_reference_state() (Eduard)
- Remove error message in check_resource_leak() for forbid_sleep_count
  because it is redundant and will never trigger (Eduard)
- Consolidate some checks and prints (Eduard)
Some BPF kfuncs acquire resources that prevent faulting: a lock, a
reference to an object under a lock, or a preempt-disable section. Today
the verifier has no way to know that holding a particular acquired
reference means faulting is forbidden. A program runs either entirely in
sleepable context or entirely in non-sleepable context, with no way to
express "faulting is forbidden right now, but will be allowed once I
release this reference."
This series adds KF_FORBID_FAULT, a new kfunc flag that can be combined
with KF_ACQUIRE. When a kfunc annotated with KF_ACQUIRE | KF_FORBID_FAULT
is called, the verifier tags the acquired reference with forbid_fault and
increments a per-state forbid_fault_count counter. When the reference is
released (through the corresponding KF_RELEASE kfunc), the counter is
decremented. The verifier checks this counter everywhere it decides
whether sleeping is allowed; that is, the implementation conservatively
blocks all sleepable operations while faulting is forbidden.
This is fully generic. Any pair of KF_ACQUIRE/KF_RELEASE kfuncs can opt
into fault prohibition by adding KF_FORBID_FAULT to the acquire side.
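
The counter bookkeeping described above can be sketched as a small
userspace model. This is only an illustration and not the verifier code:
the flag values, the model_* names, and the fixed-size reference table are
assumptions made for the sketch, and the real in_sleepable_context()
considers more conditions than the two modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; the kernel's values differ. */
#define KF_ACQUIRE      (1u << 0)
#define KF_FORBID_FAULT (1u << 1)

#define MODEL_MAX_REFS 8

struct model_ref {
	int id;             /* 0 means the slot is unused */
	bool forbid_fault;  /* reference was acquired with KF_FORBID_FAULT */
};

struct model_state {
	struct model_ref refs[MODEL_MAX_REFS];
	int next_id;
	int forbid_fault_count;
	bool prog_sleepable;
};

/* Model an acquire kfunc call: returns the new reference id, or -1. */
static int model_acquire_ref(struct model_state *st, unsigned int kf_flags)
{
	if (!(kf_flags & KF_ACQUIRE))
		return -1;
	for (int i = 0; i < MODEL_MAX_REFS; i++) {
		if (st->refs[i].id)
			continue;
		st->refs[i].id = ++st->next_id;
		st->refs[i].forbid_fault = (kf_flags & KF_FORBID_FAULT) != 0;
		if (st->refs[i].forbid_fault)
			st->forbid_fault_count++;
		return st->refs[i].id;
	}
	return -1; /* table full */
}

/* Model a release kfunc call: returns 0, or -1 if the id is not held. */
static int model_release_ref(struct model_state *st, int id)
{
	for (int i = 0; i < MODEL_MAX_REFS; i++) {
		if (id == 0 || st->refs[i].id != id)
			continue;
		if (st->refs[i].forbid_fault) {
			assert(st->forbid_fault_count > 0);
			st->forbid_fault_count--;
		}
		st->refs[i].id = 0;
		st->refs[i].forbid_fault = false;
		return 0;
	}
	return -1;
}

/* Simplified check: sleepable program and no fault-forbidding reference. */
static bool model_in_sleepable_context(const struct model_state *st)
{
	return st->prog_sleepable && st->forbid_fault_count == 0;
}
```

In this model, a sleepable program stops being in sleepable context as
soon as one KF_FORBID_FAULT reference is held, and returns to it once the
last such reference is released. A plain KF_ACQUIRE reference leaves the
counter untouched.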
To make this useful for iterators specifically, the series first extends
the verifier's iterator support to allow KF_ACQUIRE on _next and KF_RELEASE
on a separate kfunc taking an __iter argument. The verifier tracks the
acquired reference on the iterator's stack slot (st->id) and auto-releases
it on the next _next call and on the DRAINED (NULL) path, so the
acquire/release is transparent to programs that don't need mid-loop
release.
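
The auto-release behaviour can be pictured with a toy model of the
iterator's stack slot. This is a sketch under stated assumptions, not the
verifier implementation: the model_* names and the single live_refs
counter are hypothetical, and only the ref_id field is meant to mirror the
st->id tracking described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the iterator stack slot; ref_id plays the role of st->id. */
struct model_iter_slot {
	int ref_id; /* 0 means no acquired reference is tracked */
};

/* Models the set of live references in the current verifier state. */
struct model_env {
	int live_refs;
	int next_id;
};

/* Drop the reference tracked on the slot; -1 if nothing is tracked. */
static int model_iter_release_acquired_ref(struct model_env *env,
					   struct model_iter_slot *it)
{
	if (it->ref_id == 0)
		return -1;
	assert(env->live_refs > 0);
	env->live_refs--;
	it->ref_id = 0;
	return 0;
}

/*
 * Model a _next call. Any reference left over from the previous element
 * is released first. has_elem selects the branch being explored: false
 * models the DRAINED (NULL) path, which ends with no reference held.
 * Returns the new reference id, or 0 on the DRAINED path.
 */
static int model_iter_next(struct model_env *env,
			   struct model_iter_slot *it, bool has_elem)
{
	if (it->ref_id)
		model_iter_release_acquired_ref(env, it);
	if (!has_elem)
		return 0;
	it->ref_id = ++env->next_id;
	env->live_refs++;
	return it->ref_id;
}

/* Model the explicit release kfunc that takes the __iter argument. */
static int model_iter_release(struct model_env *env,
			      struct model_iter_slot *it)
{
	return model_iter_release_acquired_ref(env, it);
}
```

A loop that never calls the explicit release still ends with no live
reference, because every _next call and the DRAINED path clean up the
previous element. In this model, a second explicit release on the same
element fails, as does a release before any _next call.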
Iterator KF_ACQUIRE support is not useful on its own right now, but it
becomes the foundation for KF_FORBID_FAULT: an iterator whose _next is
annotated with KF_ACQUIRE | KF_FORBID_FAULT can now express "holding this
pointer forbids faulting; calling _release invalidates the pointer and
re-enables sleeping."
The task_vma iterator is the first user. It holds mmap_lock during
iteration, preventing sleepable helpers like bpf_copy_from_user().
Since mmap_lock is a sleeping lock (rw_semaphore), sleeping itself is
fine while holding it. The actual danger is faulting, which would try
to re-acquire mmap_lock and deadlock.
With this series, a BPF program can release the lock mid-iteration:

  bpf_for_each(task_vma, vma, task, 0) {
      u64 start = vma->vm_start;

      /* faulting forbidden, but vma pointer access allowed */
      bpf_iter_task_vma_release(&___it);
      /* mmap_lock released, vma pointer invalidated */

      /* faulting (and sleeping) is fine here */
      bpf_copy_from_user(&buf, sizeof(buf), (void *)start);
  }

The series is organized as:

  Patch 1: KF_ACQUIRE/KF_RELEASE plumbing for iterators in the
           verifier. Pure infrastructure, no behavioral change to
           existing iterators.

  Patch 2: Consolidate sleepable context error message printing
           in check_helper_call(), no functional changes.

  Patch 3: Consolidate scattered sleepable checks in check_kfunc_call()
           into a single in_sleepable_context() check.

  Patch 4: Consolidate the open-coded sleepable check in
           check_func_call() into in_sleepable_context(), consistent
           with patches 2 and 3.

  Patch 5: KF_FORBID_FAULT flag and forbid_fault_count machinery.
           Generic, works for any KF_ACQUIRE kfunc - iterator or not.

  Patch 6: Move mmap_lock acquisition from _new to _next in the
           task_vma iterator, preparing for re-acquisition after
           release.

  Patch 7: Wire up task_vma as the first user: annotate _next with
           KF_ACQUIRE | KF_FORBID_FAULT, add bpf_iter_task_vma_release().

  Patch 8: Selftests covering the runtime path (release + copy_from_user,
           nested iterators on same mm) and verifier rejection of
           invalid patterns (sleeping without release, VMA access after
           release, double release, release without acquire, nested
           iterator interaction).
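
The runtime side of patches 6 and 7 can be illustrated with a toy model
of the lock lifecycle. Everything here is an assumption made for the
sketch: the model_* names, the reader counter standing in for mmap_lock,
and in particular the choices that _next takes the lock only when it is
not already held and that a failed trylock simply ends the iteration. The
actual patches may behave differently in those details.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for mmap_lock: a reader count plus a writer flag. */
struct model_mm {
	int readers;
	bool writer_active;
};

/* Succeeds whenever no writer holds the lock; readers can nest. */
static bool model_read_trylock(struct model_mm *mm)
{
	if (mm->writer_active)
		return false;
	mm->readers++;
	return true;
}

static void model_read_unlock(struct model_mm *mm)
{
	assert(mm->readers > 0);
	mm->readers--;
}

struct model_vma_iter {
	struct model_mm *mm;
	bool locked;  /* does this iterator currently hold the read lock? */
	int pos;      /* index of the next VMA to return */
	int nr_vmas;
};

/* _new no longer takes the lock; it only records the iteration state. */
static void model_vma_iter_new(struct model_vma_iter *it,
			       struct model_mm *mm, int nr_vmas)
{
	it->mm = mm;
	it->locked = false;
	it->pos = 0;
	it->nr_vmas = nr_vmas;
}

/* Mid-loop release: drop the lock so faulting is safe again. */
static void model_vma_iter_release(struct model_vma_iter *it)
{
	if (it->locked) {
		model_read_unlock(it->mm);
		it->locked = false;
	}
}

/* Returns the index of the next VMA, or -1 when iteration ends. */
static int model_vma_iter_next(struct model_vma_iter *it)
{
	if (!it->locked) {
		if (!model_read_trylock(it->mm))
			return -1;
		it->locked = true;
	}
	if (it->pos >= it->nr_vmas) {
		model_vma_iter_release(it);
		return -1;
	}
	return it->pos++;
}

static void model_vma_iter_destroy(struct model_vma_iter *it)
{
	model_vma_iter_release(it);
}
```

Because the lock is taken in _next rather than in _new, an iterator that
dropped the lock mid-loop picks it up again on the following _next call.
Two iterators over the same model_mm can hold the read side at the same
time, which is the situation the nested-iterator selftest exercises.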
Puranjay Mohan (8):
  bpf: Add KF_ACQUIRE and KF_RELEASE support for iterators
  bpf: consolidate sleepable checks in check_helper_call()
  bpf: consolidate sleepable checks in check_kfunc_call()
  bpf: consolidate sleepable checks in check_func_call()
  bpf: Add KF_FORBID_FAULT modifier for KF_ACQUIRE kfuncs
  bpf: Move locking to bpf_iter_task_vma_next()
  bpf: Add split iteration support to task_vma iterator
  selftests/bpf: Add tests for split task_vma iterator
include/linux/bpf_verifier.h | 2 +
include/linux/btf.h | 1 +
kernel/bpf/helpers.c | 3 +-
kernel/bpf/task_iter.c | 44 +++-
kernel/bpf/verifier.c | 196 ++++++++++++------
.../testing/selftests/bpf/bpf_experimental.h | 1 +
.../testing/selftests/bpf/prog_tests/iters.c | 24 +++
.../selftests/bpf/prog_tests/summarization.c | 2 +-
tools/testing/selftests/bpf/progs/irq.c | 4 +-
.../selftests/bpf/progs/iters_task_vma.c | 71 +++++++
.../bpf/progs/iters_task_vma_nosleep.c | 125 +++++++++++
.../selftests/bpf/progs/preempt_lock.c | 6 +-
.../bpf/progs/verifier_async_cb_context.c | 4 +-
13 files changed, 399 insertions(+), 84 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma_nosleep.c
base-commit: 8feedae96f872f1b74ad40c72b5cd6a47c44d9dd
--
2.47.3