From: Puranjay Mohan
To: bpf@vger.kernel.org
Cc: Puranjay Mohan, Puranjay Mohan, Alexei Starovoitov, Andrii Nakryiko,
    Daniel Borkmann, Martin KaFai Lau, Eduard Zingerman,
    Kumar Kartikeya Dwivedi, Mykyta Yatsenko, kernel-team@meta.com
Subject: [PATCH bpf-next v3 0/6] Introduce KF_FORBID_SLEEP modifier for acquire/release kfuncs
Date: Mon, 23 Feb 2026 09:46:50 -0800
Message-ID: <20260223174659.2749964-1-puranjay@kernel.org>

Changelog:

v2: https://lore.kernel.org/all/20260223160300.2109907-1-puranjay@kernel.org/
Changes in v2 -> v3:
- Rebased on bpf-next/master

v1: https://lore.kernel.org/all/20260218182555.1501495-1-puranjay@kernel.org/
Changes in v1 -> v2:
- Add a patch to consolidate sleepable context error message printing in
  check_helper_call(), has no functional changes (Eduard)
- In check_kfunc_call() for KF_RELEASE and __iter arg, use release_regno
  like dynptr rather than custom handling (Mykyta)
- Fix some comments to follow correct style (Mykyta)
- Move state->forbid_sleep_count-- to release_reference_state() (Eduard)
- Remove error message in check_resource_leak() for forbid_sleep_count
  because it is redundant and will never trigger (Eduard)
- Consolidated some checks and prints (Eduard)

Some BPF kfuncs acquire resources that prevent sleeping - a lock, a
reference to an object under a lock, a preempt-disable section. Today
there is no way for the verifier to know that holding a particular
acquired reference means sleeping is forbidden. Programs either run
entirely in sleepable or non-sleepable context, with no way to express
"sleeping is forbidden right now, but will be allowed once I release
this reference."

This series adds KF_FORBID_SLEEP, a new kfunc flag that can be combined
with KF_ACQUIRE. When a kfunc annotated with KF_ACQUIRE | KF_FORBID_SLEEP
is called, the verifier tags the acquired reference with forbid_sleep and
increments a per-state forbid_sleep_count counter. When the reference is
released (through the corresponding KF_RELEASE kfunc), the counter is
decremented. The verifier checks this counter everywhere it decides
whether sleeping is allowed: sleepable helpers, sleepable kfuncs, global
function calls, and resource leak checks.

This is fully generic. Any pair of KF_ACQUIRE/KF_RELEASE kfuncs can opt
into sleep prohibition by adding KF_FORBID_SLEEP to the acquire side.

To make this useful for iterators specifically, the series first extends
the verifier's iterator support to allow KF_ACQUIRE on _next and
KF_RELEASE on a separate kfunc taking an __iter argument. The verifier
tracks the acquired reference on the iterator's stack slot (st->id) and
auto-releases it on the next _next call and on the DRAINED (NULL) path,
so the acquire/release is transparent to programs that don't need
mid-loop release.

Iterator KF_ACQUIRE support is not useful on its own right now, but it
becomes the foundation for KF_FORBID_SLEEP: an iterator whose _next is
annotated with KF_ACQUIRE | KF_FORBID_SLEEP can now express "holding this
pointer forbids sleeping; calling _release invalidates the pointer and
re-enables sleeping."

The task_vma iterator is the first user. It holds mmap_lock during
iteration, preventing sleepable helpers like bpf_copy_from_user().
Currently, this can lead to a deadlock as the fault path also takes
mmap_lock.

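For readers who want a concrete picture of the forbid_sleep tag and the
per-state counter described earlier in this letter, here is a minimal
user-space sketch. It is only an analogy: the real verifier tracks this
state statically per program path, not at runtime, and every name in the
sketch (toy_state, toy_acquire, toy_release, toy_may_sleep, MAX_REFS) is
hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
#define MAX_REFS 8

struct toy_ref {
	int id;            /* 0 means the slot is free */
	bool forbid_sleep; /* models KF_ACQUIRE | KF_FORBID_SLEEP */
};

struct toy_state {
	struct toy_ref refs[MAX_REFS];
	int next_id;
	int forbid_sleep_count; /* models the per-state counter */
};

/* Acquire a reference; returns its id, or -1 if the table is full. */
static int toy_acquire(struct toy_state *st, bool forbid_sleep)
{
	for (int i = 0; i < MAX_REFS; i++) {
		if (st->refs[i].id != 0)
			continue;
		st->refs[i].id = ++st->next_id;
		st->refs[i].forbid_sleep = forbid_sleep;
		if (forbid_sleep)
			st->forbid_sleep_count++;
		return st->refs[i].id;
	}
	return -1;
}

/* Release by id; returns 0 on success, -1 if no such reference is held. */
static int toy_release(struct toy_state *st, int id)
{
	if (id <= 0)
		return -1;
	for (int i = 0; i < MAX_REFS; i++) {
		if (st->refs[i].id != id)
			continue;
		if (st->refs[i].forbid_sleep) {
			assert(st->forbid_sleep_count > 0);
			st->forbid_sleep_count--;
		}
		st->refs[i].id = 0;
		st->refs[i].forbid_sleep = false;
		return 0;
	}
	return -1;
}

/* Sleeping is allowed only while no forbid-sleep reference is held. */
static bool toy_may_sleep(const struct toy_state *st)
{
	return st->forbid_sleep_count == 0;
}
```

In this model, acquiring one plain reference and one forbid-sleep
reference makes toy_may_sleep() return false; releasing the plain
reference changes nothing, and only releasing the forbid-sleep reference
re-enables sleeping. A second release of the same id fails.
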
With this series, a BPF program can release the lock mid-iteration:

	bpf_for_each(task_vma, vma, task, 0) {
		u64 start = vma->vm_start;
		/* sleeping forbidden, but vma pointer access allowed */

		bpf_iter_task_vma_release(&___it);
		/* mmap_lock released, vma pointer invalidated */

		/* sleeping is fine here */
		bpf_copy_from_user(&buf, sizeof(buf), (void *)start);
	}

The series is organized as:

  Patch 1: KF_ACQUIRE/KF_RELEASE plumbing for iterators in the verifier.
           Pure infrastructure, no behavioral change to existing
           iterators.

  Patch 2: Consolidate sleepable context error message printing in
           check_helper_call(), no functional changes

  Patch 3: KF_FORBID_SLEEP flag and forbid_sleep_count machinery.
           Generic, works for any KF_ACQUIRE kfunc - iterator or not.

  Patch 4: Move mmap_lock acquisition from _new to _next in the task_vma
           iterator, preparing for re-acquisition after release.

  Patch 5: Wire up task_vma as the first user: annotate _next with
           KF_ACQUIRE | KF_FORBID_SLEEP, add bpf_iter_task_vma_release().

  Patch 6: Selftests covering the runtime path (release +
           copy_from_user) and verifier rejection of invalid patterns
           (sleeping without release, VMA access after release, double
           release, release without acquire, nested iterator
           interaction).

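The iterator rules from this letter (acquire on _next, auto-release on
the following _next and on the drained path, explicit mid-loop release)
can be illustrated with another small user-space sketch. As before, this
is an assumption-laden analogy rather than verifier code: the verifier
enforces these rules statically, and toy_iter, toy_iter_new,
toy_iter_next, toy_iter_release, toy_iter_elem_valid and
toy_iter_may_sleep are hypothetical names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical toy model of an iterator whose _next acquires a reference. */
struct toy_iter {
	int remaining;          /* elements left to yield */
	bool ref_held;          /* models the reference tracked on the iterator slot */
	int forbid_sleep_count; /* models the per-state counter */
};

static void toy_iter_new(struct toy_iter *it, int nr_elems)
{
	it->remaining = nr_elems;
	it->ref_held = false;
	it->forbid_sleep_count = 0;
}

/* Models _next with KF_ACQUIRE | KF_FORBID_SLEEP; true if an element is yielded. */
static bool toy_iter_next(struct toy_iter *it)
{
	/* Auto-release whatever the previous _next call acquired. */
	if (it->ref_held) {
		it->ref_held = false;
		it->forbid_sleep_count--;
	}
	if (it->remaining == 0)
		return false; /* drained: nothing is acquired on this path */
	it->remaining--;
	it->ref_held = true;
	it->forbid_sleep_count++;
	return true;
}

/* Models the separate KF_RELEASE kfunc; -1 stands for a rejected program. */
static int toy_iter_release(struct toy_iter *it)
{
	if (!it->ref_held)
		return -1; /* release without acquire, or double release */
	it->ref_held = false;
	it->forbid_sleep_count--;
	return 0;
}

/* The yielded element may be dereferenced only while the reference is held. */
static bool toy_iter_elem_valid(const struct toy_iter *it)
{
	return it->ref_held;
}

static bool toy_iter_may_sleep(const struct toy_iter *it)
{
	return it->forbid_sleep_count == 0;
}
```

Walking a two-element iterator through this model reproduces the
patterns named in Patch 6: release before the first _next fails, the
element is valid and sleeping is forbidden right after _next, an explicit
release re-enables sleeping and invalidates the element, a second release
fails, and a loop that never calls release still ends with the counter at
zero because the drained _next call drops the last reference.
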
Puranjay Mohan (6):
  bpf: Add KF_ACQUIRE and KF_RELEASE support for iterators
  bpf: consolidate sleepable context error message printing
  bpf: Add KF_FORBID_SLEEP modifier for KF_ACQUIRE kfuncs
  bpf: Move locking to bpf_iter_task_vma_next()
  bpf: Add split iteration support to task_vma iterator
  selftests/bpf: Add tests for split task_vma iterator

 include/linux/bpf_verifier.h                  |   2 +
 include/linux/btf.h                           |   1 +
 kernel/bpf/btf.c                              |   3 +
 kernel/bpf/helpers.c                          |   3 +-
 kernel/bpf/task_iter.c                        |  44 ++++--
 kernel/bpf/verifier.c                         | 147 +++++++++++++-----
 .../testing/selftests/bpf/bpf_experimental.h  |   1 +
 .../testing/selftests/bpf/prog_tests/iters.c  |  13 ++
 .../selftests/bpf/progs/iters_task_vma.c      |  39 +++++
 .../bpf/progs/iters_task_vma_nosleep.c        | 125 +++++++++++++++
 10 files changed, 331 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma_nosleep.c

base-commit: 3ecf0b4a0e0ed4783aa32c5f3e42d23c7021e1c8
-- 
2.47.3