public inbox for bpf@vger.kernel.org
* [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs
@ 2026-02-25 16:23 Mykyta Yatsenko
  2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

This series adds support for sleepable BPF programs attached to raw
tracepoints (tp_btf). The motivation is to allow BPF programs on
syscall tracepoints to use sleepable helpers such as
bpf_copy_from_user(), enabling reliable user memory reads that can
page-fault.

Currently, raw tracepoint BPF programs always run with RCU read lock
held and preemption disabled, which prevents calling any helper that
might sleep. Faultable tracepoints (__DECLARE_TRACE_SYSCALL) already
run under rcu_tasks_trace protection in process context where sleeping
is safe.

This series removes that restriction for faultable tracepoints:

Patch 1 adds an attach-time check to reject sleepable programs on
non-faultable tracepoints (e.g., sched_switch) that may run in NMI
or other non-sleepable contexts. This is a no-op until the verifier
allows sleepable raw_tp programs.

Patch 2 modifies __bpf_trace_run() to support sleepable programs:
use migrate_disable() instead of rcu_read_lock() for sleepable
programs, call might_fault() to annotate faultable context, and rely
on the outer rcu_tasks_trace lock from the faultable tracepoint
callback for program lifetime protection.

Patch 3 removes preempt_disable from the faultable tracepoint BPF
callback wrapper, since preemption management is now handled
per-program inside __bpf_trace_run().

Patch 4 allows BPF_TRACE_RAW_TP programs to be loaded as sleepable.
All runtime infrastructure is in place at this point, ensuring no
bisectability issues.

Patch 5 adds the tp_btf.s section handler in libbpf, following the
existing pattern of fentry.s/fexit.s/lsm.s.

Patch 6 adds selftests covering both the positive case (sleepable
program on sys_enter using bpf_copy_from_user() to read user memory)
and the negative case (sleepable program rejected on sched_switch).

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
Changes in v2:
  - Addressed AI review feedback; reordered the patches
  - Link to v1: https://lore.kernel.org/bpf/20260218-sleepable_tracepoints-v1-0-ec2705497208@meta.com/

---
Mykyta Yatsenko (6):
      bpf: Reject sleepable raw_tp programs on non-faultable tracepoints
      bpf: Add sleepable execution path for raw tracepoint programs
      bpf: Remove preempt_disable from faultable tracepoint BPF callbacks
      bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type
      libbpf: Add tp_btf.s section handler for sleepable raw tracepoints
      selftests/bpf: Add tests for sleepable raw tracepoint programs

 include/trace/bpf_probe.h                          |  2 -
 kernel/bpf/syscall.c                               |  5 ++
 kernel/bpf/verifier.c                              |  3 +-
 kernel/trace/bpf_trace.c                           | 20 ++++++--
 tools/lib/bpf/libbpf.c                             |  1 +
 .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
 .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
 .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
 tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
 9 files changed, 142 insertions(+), 9 deletions(-)
---
base-commit: f620af11c27b8ec9994a39fe968aa778112d1566
change-id: 20260216-sleepable_tracepoints-381ae1410550

Best regards,
-- 
Mykyta Yatsenko <yatsenko@meta.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-03-06  3:59   ` Kumar Kartikeya Dwivedi
  2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Add an attach-time check in bpf_raw_tp_link_attach() to ensure that
sleepable BPF programs can only attach to faultable tracepoints.
Faultable tracepoints (e.g., sys_enter, sys_exit) are guaranteed to
run in a context where sleeping is safe, using rcu_tasks_trace for
protection. Non-faultable tracepoints may run in NMI or other
non-sleepable contexts.

This complements the verifier-side change that allows BPF_TRACE_RAW_TP
programs to be loaded as sleepable.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 kernel/bpf/syscall.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0378e83b4099..6ddafb1b03fd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4261,6 +4261,11 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
 	if (!btp)
 		return -ENOENT;
 
+	if (prog->sleepable && !tracepoint_is_faultable(btp->tp)) {
+		err = -EINVAL;
+		goto out_put_btp;
+	}
+
 	link = kzalloc_obj(*link, GFP_USER);
 	if (!link) {
 		err = -ENOMEM;

-- 
2.47.3



* [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
  2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-02-25 17:12   ` bot+bpf-ci
  2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Modify __bpf_trace_run() to support both sleepable and non-sleepable
BPF programs. When the program is sleepable:

- Skip cant_sleep() and instead call might_fault() to annotate
  the faultable context
- Use migrate_disable()/migrate_enable() instead of
  rcu_read_lock()/rcu_read_unlock() to allow sleeping while
  still protecting percpu data access
- The outer rcu_tasks_trace lock is already held by the faultable
  tracepoint callback (__DECLARE_TRACE_SYSCALL), providing lifetime
  protection for the BPF program

For non-sleepable programs, behavior is unchanged: cant_sleep() check,
rcu_read_lock() protection.

This allows multiple BPF programs with different sleepable settings
to coexist on the same faultable tracepoint, since __bpf_trace_run()
is invoked per-link.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 kernel/trace/bpf_trace.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 9bc0dfd235af..2dd345e0fdb0 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2076,7 +2076,7 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
 	struct bpf_run_ctx *old_run_ctx;
 	struct bpf_trace_run_ctx run_ctx;
 
-	rcu_read_lock_dont_migrate();
+	migrate_disable();
 	if (unlikely(!bpf_prog_get_recursion_context(prog))) {
 		bpf_prog_inc_misses_counter(prog);
 		goto out;
@@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
 	run_ctx.bpf_cookie = link->cookie;
 	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 
-	(void) bpf_prog_run(prog, args);
+	if (prog->sleepable) {
+		might_fault();
+		(void)bpf_prog_run(prog, args);
+	} else {
+		/*
+		 * Non-sleepable programs may run in the faultable context,
+		 * do cant_sleep() only if program is non-sleepable and context
+		 * is non-faultable.
+		 */
+		if (!link->link.sleepable)
+			cant_sleep();
+		rcu_read_lock();
+		(void)bpf_prog_run(prog, args);
+		rcu_read_unlock();
+	}
 
 	bpf_reset_run_ctx(old_run_ctx);
 out:
 	bpf_prog_put_recursion_context(prog);
-	rcu_read_unlock_migrate();
+	migrate_enable();
 }
 
 #define UNPACK(...)			__VA_ARGS__

-- 
2.47.3



* [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
  2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
  2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-03-06  4:25   ` Kumar Kartikeya Dwivedi
  2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Remove preempt_disable_notrace()/preempt_enable_notrace() from
__BPF_DECLARE_TRACE_SYSCALL, the BPF probe callback wrapper for
faultable (syscall) tracepoints.

The preemption management is now handled inside __bpf_trace_run()
on a per-program basis: migrate_disable() for sleepable programs,
rcu_read_lock() (which implies preempt-off in non-PREEMPT_RCU
configs) for non-sleepable programs. This allows sleepable BPF
programs to actually sleep when attached to faultable tracepoints.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 include/trace/bpf_probe.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index 9391d54d3f12..d1de8f9aa07f 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -58,9 +58,7 @@ static notrace void							\
 __bpf_trace_##call(void *__data, proto)					\
 {									\
 	might_fault();							\
-	preempt_disable_notrace();					\
 	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args));	\
-	preempt_enable_notrace();					\
 }
 
 #undef DECLARE_EVENT_SYSCALL_CLASS

-- 
2.47.3



* [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
                   ` (2 preceding siblings ...)
  2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
  2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
  2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
  5 siblings, 1 reply; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Add BPF_TRACE_RAW_TP to the set of tracing program attach types
that can be loaded as sleepable in can_be_sleepable(). The actual
enforcement that the target tracepoint supports sleepable execution
(i.e., is faultable) is deferred to attach time, since the target
tracepoint is not known at program load time.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 kernel/bpf/verifier.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1153a828ce8d..9ec80596ff1d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -25199,6 +25199,7 @@ static bool can_be_sleepable(struct bpf_prog *prog)
 		case BPF_MODIFY_RETURN:
 		case BPF_TRACE_ITER:
 		case BPF_TRACE_FSESSION:
+		case BPF_TRACE_RAW_TP:
 			return true;
 		default:
 			return false;
@@ -25228,7 +25229,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	}
 
 	if (prog->sleepable && !can_be_sleepable(prog)) {
-		verbose(env, "Only fentry/fexit/fmod_ret, lsm, iter, uprobe, and struct_ops programs can be sleepable\n");
+		verbose(env, "Only fentry/fexit/fmod_ret, lsm, iter, uprobe, struct_ops, and raw_tp programs can be sleepable\n");
 		return -EINVAL;
 	}
 

-- 
2.47.3



* [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
                   ` (3 preceding siblings ...)
  2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
  2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
  5 siblings, 1 reply; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Add SEC_DEF for "tp_btf.s+" section prefix, enabling userspace BPF
programs to use SEC("tp_btf.s/<tracepoint>") to load sleepable raw
tracepoint programs. This follows the existing pattern used for
fentry.s, fexit.s, fmod_ret.s, and lsm.s section definitions.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 tools/lib/bpf/libbpf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 0be7017800fe..7109697b22b7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9863,6 +9863,7 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("raw_tracepoint.w+",	RAW_TRACEPOINT_WRITABLE, 0, SEC_NONE, attach_raw_tp),
 	SEC_DEF("raw_tp.w+",		RAW_TRACEPOINT_WRITABLE, 0, SEC_NONE, attach_raw_tp),
 	SEC_DEF("tp_btf+",		TRACING, BPF_TRACE_RAW_TP, SEC_ATTACH_BTF, attach_trace),
+	SEC_DEF("tp_btf.s+",		TRACING, BPF_TRACE_RAW_TP, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
 	SEC_DEF("fentry+",		TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF, attach_trace),
 	SEC_DEF("fmod_ret+",		TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF, attach_trace),
 	SEC_DEF("fexit+",		TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF, attach_trace),

-- 
2.47.3



* [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs
  2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
                   ` (4 preceding siblings ...)
  2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
@ 2026-02-25 16:23 ` Mykyta Yatsenko
  2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
  2026-03-09 21:11   ` Jiri Olsa
  5 siblings, 2 replies; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-02-25 16:23 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87; +Cc: Mykyta Yatsenko

From: Mykyta Yatsenko <yatsenko@meta.com>

Add two subtests:
- success: Attach a sleepable BPF program to the faultable sys_enter
  tracepoint (tp_btf.s/sys_enter). Verify the program is triggered by
  a syscall.
- reject_non_faultable: Attempt to attach a sleepable BPF program to
  a non-faultable tracepoint (tp_btf.s/sched_switch). Verify that
  attachment is rejected.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
 .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
 .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
 .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
 tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
 4 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
new file mode 100644
index 000000000000..9b0ec7cc4cac
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include <time.h>
+#include "test_sleepable_raw_tp.skel.h"
+#include "test_sleepable_raw_tp_fail.skel.h"
+
+static void test_sleepable_raw_tp_success(void)
+{
+	struct test_sleepable_raw_tp *skel;
+	int err;
+
+	skel = test_sleepable_raw_tp__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
+		return;
+
+	skel->bss->target_pid = getpid();
+
+	err = test_sleepable_raw_tp__attach(skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto cleanup;
+
+	syscall(__NR_nanosleep, &(struct timespec){ .tv_nsec = 555 }, NULL);
+
+	ASSERT_EQ(skel->bss->triggered, 1, "triggered");
+	ASSERT_EQ(skel->bss->err, 0, "err");
+	ASSERT_EQ(skel->bss->copied_tv_nsec, 555, "copied_tv_nsec");
+
+cleanup:
+	test_sleepable_raw_tp__destroy(skel);
+}
+
+static void test_sleepable_raw_tp_reject(void)
+{
+	struct test_sleepable_raw_tp_fail *skel;
+	int err;
+
+	skel = test_sleepable_raw_tp_fail__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
+		goto cleanup;
+
+	err = test_sleepable_raw_tp_fail__attach(skel);
+	ASSERT_ERR(err, "skel_attach_should_fail");
+
+cleanup:
+	test_sleepable_raw_tp_fail__destroy(skel);
+}
+
+void test_sleepable_raw_tp(void)
+{
+	if (test__start_subtest("success"))
+		test_sleepable_raw_tp_success();
+	if (test__start_subtest("reject_non_faultable"))
+		test_sleepable_raw_tp_reject();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c
new file mode 100644
index 000000000000..ebacc766df57
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <asm/unistd.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+int target_pid;
+int triggered;
+long err;
+long copied_tv_nsec;
+
+SEC("tp_btf.s/sys_enter")
+int BPF_PROG(test_sleepable_sys_enter, struct pt_regs *regs, long id)
+{
+	struct task_struct *task = bpf_get_current_task_btf();
+	struct __kernel_timespec *ts;
+	long tv_nsec;
+
+	if (task->pid != target_pid)
+		return 0;
+
+	if (id != __NR_nanosleep)
+		return 0;
+
+	ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs);
+
+	/*
+	 * Use bpf_copy_from_user() - a sleepable helper - to read user memory.
+	 * This exercises the sleepable execution path of raw tracepoints.
+	 */
+	err = bpf_copy_from_user(&tv_nsec, sizeof(tv_nsec), &ts->tv_nsec);
+	if (err)
+		return err;
+
+	copied_tv_nsec = tv_nsec;
+	triggered = 1;
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c
new file mode 100644
index 000000000000..ef5dc3888df6
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+/* Sleepable program on a non-faultable tracepoint should fail at attach */
+SEC("tp_btf.s/sched_switch")
+int BPF_PROG(test_sleepable_sched_switch, bool preempt,
+	     struct task_struct *prev, struct task_struct *next)
+{
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/verifier/sleepable.c b/tools/testing/selftests/bpf/verifier/sleepable.c
index 1f0d2bdc673f..39522b7cd317 100644
--- a/tools/testing/selftests/bpf/verifier/sleepable.c
+++ b/tools/testing/selftests/bpf/verifier/sleepable.c
@@ -76,7 +76,7 @@
 	.runs = -1,
 },
 {
-	"sleepable raw tracepoint reject",
+	"sleepable raw tracepoint accept",
 	.insns = {
 	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
@@ -84,8 +84,7 @@
 	.prog_type = BPF_PROG_TYPE_TRACING,
 	.expected_attach_type = BPF_TRACE_RAW_TP,
 	.kfunc = "sched_switch",
-	.result = REJECT,
-	.errstr = "Only fentry/fexit/fmod_ret, lsm, iter, uprobe, and struct_ops programs can be sleepable",
+	.result = ACCEPT,
 	.flags = BPF_F_SLEEPABLE,
 	.runs = -1,
 },

-- 
2.47.3



* Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
  2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
@ 2026-02-25 17:12   ` bot+bpf-ci
  2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 19+ messages in thread
From: bot+bpf-ci @ 2026-02-25 17:12 UTC (permalink / raw)
  To: mykyta.yatsenko5, bpf, ast, andrii, daniel, kafai, kernel-team,
	eddyz87
  Cc: yatsenko, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
	clm, ihor.solodrai


> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9bc0dfd235af..2dd345e0fdb0 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c

[ ... ]

> @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
>  	run_ctx.bpf_cookie = link->cookie;
>  	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>
> -	(void) bpf_prog_run(prog, args);
> +	if (prog->sleepable) {
> +		might_fault();
> +		(void)bpf_prog_run(prog, args);

At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
still wraps the BPF callback in preempt_disable_notrace() /
preempt_enable_notrace():

include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
    might_fault();
    preempt_disable_notrace();
    bpf_trace_run##x(__data, ...);
    preempt_enable_notrace();
}

Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
called between the preempt_disable_notrace() and
preempt_enable_notrace(), the might_fault() here will execute with
preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
kernels.

More importantly, if the sleepable program calls a helper like
bpf_copy_from_user(), the page fault handler will see in_atomic()
and return -EFAULT for any non-resident page instead of faulting it
in, defeating the purpose of sleepable programs.

The v1 RFC series included a separate patch to remove
preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
that change be folded into this commit so the sleepable path works
at each bisect point?

> +	} else {
> +		/*
> +		 * Non-sleepable programs may run in the faultable context,
> +		 * do cant_sleep() only if program is non-sleepable and context
> +		 * is non-faultable.
> +		 */
> +		if (!link->link.sleepable)
> +			cant_sleep();
> +		rcu_read_lock();
> +		(void)bpf_prog_run(prog, args);
> +		rcu_read_unlock();
> +	}

[ ... ]

The commit message states:

> For non-sleepable programs, behavior is unchanged: cant_sleep() check,
> rcu_read_lock() protection.

The parent commit does not have a cant_sleep() call in
__bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
conditional cant_sleep() is new in this commit.  Could this sentence
be updated to reflect the actual change?


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22406522761

AI-authorship-score: low
AI-authorship-explanation: Natural, domain-specific language with a small inaccuracy in the commit message that is more characteristic of human writing; code shows deep subsystem expertise.
issues-found: 2
issue-severity-score: medium
issue-severity-explanation: Sleepable BPF programs on faultable tracepoints cannot actually sleep because __BPF_DECLARE_TRACE_SYSCALL still wraps execution in preempt_disable_notrace; this defeats the purpose of the feature and produces debug warnings, but does not crash the system.


* Re: [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints
  2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
@ 2026-03-06  3:59   ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  3:59 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, 25 Feb 2026 at 17:24, Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> From: Mykyta Yatsenko <yatsenko@meta.com>
>
> Add an attach-time check in bpf_raw_tp_link_attach() to ensure that
> sleepable BPF programs can only attach to faultable tracepoints.
> Faultable tracepoints (e.g., sys_enter, sys_exit) are guaranteed to
> run in a context where sleeping is safe, using rcu_tasks_trace for
> protection. Non-faultable tracepoints may run in NMI or other
> non-sleepable contexts.
>
> This complements the verifier-side change that allows BPF_TRACE_RAW_TP
> programs to be loaded as sleepable.
>
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

> [...]


* Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
  2026-02-25 17:12   ` bot+bpf-ci
@ 2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
  2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:23 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: mykyta.yatsenko5, bpf, ast, andrii, daniel, kafai, kernel-team,
	eddyz87, yatsenko, martin.lau, yonghong.song, clm, ihor.solodrai

On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 9bc0dfd235af..2dd345e0fdb0 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
>
> [ ... ]
>
> > @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> >       run_ctx.bpf_cookie = link->cookie;
> >       old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
> >
> > -     (void) bpf_prog_run(prog, args);
> > +     if (prog->sleepable) {
> > +             might_fault();
> > +             (void)bpf_prog_run(prog, args);
>
> At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
> still wraps the BPF callback in preempt_disable_notrace() /
> preempt_enable_notrace():
>
> include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
>     might_fault();
>     preempt_disable_notrace();
>     bpf_trace_run##x(__data, ...);
>     preempt_enable_notrace();
> }
>
> Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
> called between the preempt_disable_notrace() and
> preempt_enable_notrace(), the might_fault() here will execute with
> preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
> kernels.
>
> More importantly, if the sleepable program calls a helper like
> bpf_copy_from_user(), the page fault handler will see in_atomic()
> and return -EFAULT for any non-resident page instead of faulting it
> in, defeating the purpose of sleepable programs.
>
> The v1 RFC series included a separate patch to remove
> preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
> that change be folded into this commit so the sleepable path works
> at each bisect point?

But sleepable programs are not yet permitted, so I think this is not
needed for bisection-friendliness.

>
> > +     } else {
> > +             /*
> > +              * Non-sleepable programs may run in the faultable context,
> > +              * do cant_sleep() only if program is non-sleepable and context
> > +              * is non-faultable.
> > +              */
> > +             if (!link->link.sleepable)
> > +                     cant_sleep();
> > +             rcu_read_lock();
> > +             (void)bpf_prog_run(prog, args);
> > +             rcu_read_unlock();
> > +     }
>
> [ ... ]
>
> The commit message states:
>
> > For non-sleepable programs, behavior is unchanged: cant_sleep() check,
> > rcu_read_lock() protection.
>
> The parent commit does not have a cant_sleep() call in
> __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
> conditional cant_sleep() is new in this commit.  Could this sentence
> be updated to reflect the actual change?

This seems worth adjusting.

>
>

Other than that LGTM.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

> [...]


* Re: [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks
  2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
@ 2026-03-06  4:25   ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:25 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, 25 Feb 2026 at 17:25, Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> From: Mykyta Yatsenko <yatsenko@meta.com>
>
> Remove preempt_disable_notrace()/preempt_enable_notrace() from
> __BPF_DECLARE_TRACE_SYSCALL, the BPF probe callback wrapper for
> faultable (syscall) tracepoints.
>
> The preemption management is now handled inside __bpf_trace_run()
> on a per-program basis: migrate_disable() for sleepable programs,
> rcu_read_lock() (which implies preempt-off in non-PREEMPT_RCU
> configs) for non-sleepable programs. This allows sleepable BPF
> programs to actually sleep when attached to faultable tracepoints.
>
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

> [...]


* Re: [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type
  2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
@ 2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:26 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, 25 Feb 2026 at 17:25, Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> From: Mykyta Yatsenko <yatsenko@meta.com>
>
> Add BPF_TRACE_RAW_TP to the set of tracing program attach types
> that can be loaded as sleepable in can_be_sleepable(). The actual
> enforcement that the target tracepoint supports sleepable execution
> (i.e., is faultable) is deferred to attach time, since the target
> tracepoint is not known at program load time.
>
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

> [...]


* Re: [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints
  2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
@ 2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:26 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, 25 Feb 2026 at 17:29, Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> From: Mykyta Yatsenko <yatsenko@meta.com>
>
> Add SEC_DEF for "tp_btf.s+" section prefix, enabling userspace BPF
> programs to use SEC("tp_btf.s/<tracepoint>") to load sleepable raw
> tracepoint programs. This follows the existing pattern used for
> fentry.s, fexit.s, fmod_ret.s, and lsm.s section definitions.
>
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

> [...]


* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs
  2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
@ 2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
  2026-03-06 23:56     ` Kumar Kartikeya Dwivedi
  2026-03-09 21:11   ` Jiri Olsa
  1 sibling, 1 reply; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:41 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, 25 Feb 2026 at 17:25, Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> From: Mykyta Yatsenko <yatsenko@meta.com>
>
> Add two subtests:
> - success: Attach a sleepable BPF program to the faultable sys_enter
>   tracepoint (tp_btf.s/sys_enter). Verify the program is triggered by
>   a syscall.
> - reject_non_faultable: Attempt to attach a sleepable BPF program to
>   a non-faultable tracepoint (tp_btf.s/sched_switch). Verify that
>   attachment is rejected.
>
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

... but see nit below.

>  .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
>  .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
>  .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
>  tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
>  4 files changed, 117 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> new file mode 100644
> index 000000000000..9b0ec7cc4cac
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include <time.h>
> +#include "test_sleepable_raw_tp.skel.h"
> +#include "test_sleepable_raw_tp_fail.skel.h"
> +
> +static void test_sleepable_raw_tp_success(void)
> +{
> +       struct test_sleepable_raw_tp *skel;
> +       int err;
> +
> +       skel = test_sleepable_raw_tp__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "skel_open_load"))
> +               return;
> +
> +       skel->bss->target_pid = getpid();

Shouldn't this be sys_gettid()? Otherwise the filter in prog only
works for the main thread (pid == tgid) in parallel test_progs mode.

> +
> +       err = test_sleepable_raw_tp__attach(skel);
> +       if (!ASSERT_OK(err, "skel_attach"))
> +               goto cleanup;
> +
> +       syscall(__NR_nanosleep, &(struct timespec){ .tv_nsec = 555 }, NULL);
> +
> +       ASSERT_EQ(skel->bss->triggered, 1, "triggered");
> +       ASSERT_EQ(skel->bss->err, 0, "err");
> +       ASSERT_EQ(skel->bss->copied_tv_nsec, 555, "copied_tv_nsec");
> +
> +cleanup:
> +       test_sleepable_raw_tp__destroy(skel);
> +}
> +
> +static void test_sleepable_raw_tp_reject(void)
> +{
> +       struct test_sleepable_raw_tp_fail *skel;
> +       int err;
> +
> +       skel = test_sleepable_raw_tp_fail__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "skel_open_load"))
> +               goto cleanup;
> +
> +       err = test_sleepable_raw_tp_fail__attach(skel);
> +       ASSERT_ERR(err, "skel_attach_should_fail");
> +
> +cleanup:
> +       test_sleepable_raw_tp_fail__destroy(skel);
> +}
> +
> +void test_sleepable_raw_tp(void)
> +{
> +       if (test__start_subtest("success"))
> +               test_sleepable_raw_tp_success();
> +       if (test__start_subtest("reject_non_faultable"))
> +               test_sleepable_raw_tp_reject();
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c
> new file mode 100644
> index 000000000000..ebacc766df57
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
> +
> +#include <vmlinux.h>
> +#include <asm/unistd.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_core_read.h>
> +#include <bpf/bpf_helpers.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +int target_pid;
> +int triggered;
> +long err;
> +long copied_tv_nsec;
> +
> +SEC("tp_btf.s/sys_enter")
> +int BPF_PROG(test_sleepable_sys_enter, struct pt_regs *regs, long id)
> +{
> +       struct task_struct *task = bpf_get_current_task_btf();
> +       struct __kernel_timespec *ts;
> +       long tv_nsec;
> +
> +       if (task->pid != target_pid)
> +               return 0;
> +
> +       if (id != __NR_nanosleep)
> +               return 0;
> +
> +       ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs);
> +
> +       /*
> +        * Use bpf_copy_from_user() - a sleepable helper - to read user memory.
> +        * This exercises the sleepable execution path of raw tracepoints.
> +        */
> +       err = bpf_copy_from_user(&tv_nsec, sizeof(tv_nsec), &ts->tv_nsec);
> +       if (err)
> +               return err;
> +
> +       copied_tv_nsec = tv_nsec;
> +       triggered = 1;
> +       return 0;
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c
> new file mode 100644
> index 000000000000..ef5dc3888df6
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_sleepable_raw_tp_fail.c
> @@ -0,0 +1,16 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
> +
> +#include <vmlinux.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_helpers.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +/* Sleepable program on a non-faultable tracepoint should fail at attach */
> +SEC("tp_btf.s/sched_switch")
> +int BPF_PROG(test_sleepable_sched_switch, bool preempt,
> +            struct task_struct *prev, struct task_struct *next)
> +{
> +       return 0;
> +}
> diff --git a/tools/testing/selftests/bpf/verifier/sleepable.c b/tools/testing/selftests/bpf/verifier/sleepable.c
> index 1f0d2bdc673f..39522b7cd317 100644
> --- a/tools/testing/selftests/bpf/verifier/sleepable.c
> +++ b/tools/testing/selftests/bpf/verifier/sleepable.c
> @@ -76,7 +76,7 @@
>         .runs = -1,
>  },
>  {
> -       "sleepable raw tracepoint reject",
> +       "sleepable raw tracepoint accept",
>         .insns = {
>         BPF_MOV64_IMM(BPF_REG_0, 0),
>         BPF_EXIT_INSN(),
> @@ -84,8 +84,7 @@
>         .prog_type = BPF_PROG_TYPE_TRACING,
>         .expected_attach_type = BPF_TRACE_RAW_TP,
>         .kfunc = "sched_switch",
> -       .result = REJECT,
> -       .errstr = "Only fentry/fexit/fmod_ret, lsm, iter, uprobe, and struct_ops programs can be sleepable",
> +       .result = ACCEPT,
>         .flags = BPF_F_SLEEPABLE,
>         .runs = -1,
>  },
>
> --
> 2.47.3
>
>


* Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
  2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
@ 2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
  2026-03-10 17:38         ` Yonghong Song
  0 siblings, 1 reply; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06  4:52 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: mykyta.yatsenko5, bpf, ast, andrii, daniel, kafai, kernel-team,
	eddyz87, yatsenko, martin.lau, yonghong.song, clm, ihor.solodrai

On Fri, 6 Mar 2026 at 05:23, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
> >
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 9bc0dfd235af..2dd345e0fdb0 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> >
> > [ ... ]
> >
> > > @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> > >       run_ctx.bpf_cookie = link->cookie;
> > >       old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
> > >
> > > -     (void) bpf_prog_run(prog, args);
> > > +     if (prog->sleepable) {
> > > +             might_fault();
> > > +             (void)bpf_prog_run(prog, args);
> >
> > At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
> > still wraps the BPF callback in preempt_disable_notrace() /
> > preempt_enable_notrace():
> >
> > include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
> >     might_fault();
> >     preempt_disable_notrace();
> >     bpf_trace_run##x(__data, ...);
> >     preempt_enable_notrace();
> > }
> >
> > Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
> > called between the preempt_disable_notrace() and
> > preempt_enable_notrace(), the might_fault() here will execute with
> > preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
> > kernels.
> >
> > More importantly, if the sleepable program calls a helper like
> > bpf_copy_from_user(), the page fault handler will see in_atomic()
> > and return -EFAULT for any non-resident page instead of faulting it
> > in, defeating the purpose of sleepable programs.
> >
> > The v1 RFC series included a separate patch to remove
> > preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
> > that change be folded into this commit so the sleepable path works
> > at each bisect point?
>
> But sleepable programs are not yet permitted, so I think this is not
> needed for bisection-friendliness.
>
> >
> > > +     } else {
> > > +             /*
> > > +              * Non-sleepable programs may run in the faultable context,
> > > +              * do cant_sleep() only if program is non-sleepable and context
> > > +              * is non-faultable.
> > > +              */
> > > +             if (!link->link.sleepable)
> > > +                     cant_sleep();
> > > +             rcu_read_lock();
> > > +             (void)bpf_prog_run(prog, args);
> > > +             rcu_read_unlock();
> > > +     }
> >
> > [ ... ]
> >
> > The commit message states:
> >
> > > For non-sleepable programs, behavior is unchanged: cant_sleep() check,
> > > rcu_read_lock() protection.
> >
> > The parent commit does not have a cant_sleep() call in
> > __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
> > conditional cant_sleep() is new in this commit.  Could this sentence
> > be updated to reflect the actual change?
>
> This seems worth adjusting.

Actually, with the extra cant_sleep() check, we get this splat when
running normal raw_tp tests.
E.g.

...
[  295.211949] BUG: assuming atomic context at kernel/trace/bpf_trace.c:2098
[  295.212166] in_atomic(): 0, irqs_disabled(): 0, pid: 587, name: test_progs
[  295.212316] 3 locks held by test_progs/587:
...
[  295.213044] Call Trace:
[  295.213046]  <TASK>
[  295.213048]  dump_stack_lvl+0x54/0x70
[  295.213055]  __cant_sleep+0xb7/0xd0
[  295.213060]  bpf_trace_run1+0xce/0x340
[  295.213072]  bpf_testmod_test_read+0x433/0x5e0 [bpf_testmod]
[  295.213080]  ? srso_return_thunk+0x5/0x5f
[  295.213093]  kernfs_fop_read_iter+0x166/0x220
...

It means that if a tracepoint is not marked faultable but is invoked from a
non-atomic context, cant_sleep() produces a spurious warning.
It might be better to just drop the check, since I don't think it adds much value.


>
> >
> >
>
> Other than that LGTM.
>
> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>
> > [...]


* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs
  2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
@ 2026-03-06 23:56     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-06 23:56 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Fri, 6 Mar 2026 at 05:41, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Wed, 25 Feb 2026 at 17:25, Mykyta Yatsenko
> <mykyta.yatsenko5@gmail.com> wrote:
> >
> > From: Mykyta Yatsenko <yatsenko@meta.com>
> >
> > Add two subtests:
> > - success: Attach a sleepable BPF program to the faultable sys_enter
> >   tracepoint (tp_btf.s/sys_enter). Verify the program is triggered by
> >   a syscall.
> > - reject_non_faultable: Attempt to attach a sleepable BPF program to
> >   a non-faultable tracepoint (tp_btf.s/sched_switch). Verify that
> >   attachment is rejected.
> >
> > Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> > ---
>
> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>
> ... but see nit below.
>
> >  .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
> >  .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
> >  .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
> >  tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
> >  4 files changed, 117 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> > new file mode 100644
> > index 000000000000..9b0ec7cc4cac
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
> > +
> > +#include <test_progs.h>
> > +#include <time.h>
> > +#include "test_sleepable_raw_tp.skel.h"
> > +#include "test_sleepable_raw_tp_fail.skel.h"
> > +
> > +static void test_sleepable_raw_tp_success(void)
> > +{
> > +       struct test_sleepable_raw_tp *skel;
> > +       int err;
> > +
> > +       skel = test_sleepable_raw_tp__open_and_load();
> > +       if (!ASSERT_OK_PTR(skel, "skel_open_load"))
> > +               return;
> > +
> > +       skel->bss->target_pid = getpid();
>
> Shouldn't this be sys_gettid()? Otherwise the filter in prog only
> works for the main thread (pid == tgid) in parallel test_progs mode.
>

Answering myself: this is fine, since parallel test_progs mode uses fork()
rather than pthread_create(), so pid == tgid still holds in the test process.

> > [...]


* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs
  2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
  2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
@ 2026-03-09 21:11   ` Jiri Olsa
  2026-03-10  0:22     ` Mykyta Yatsenko
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2026-03-09 21:11 UTC (permalink / raw)
  To: Mykyta Yatsenko
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On Wed, Feb 25, 2026 at 08:23:54AM -0800, Mykyta Yatsenko wrote:
> From: Mykyta Yatsenko <yatsenko@meta.com>
> 
> Add two subtests:
> - success: Attach a sleepable BPF program to the faultable sys_enter
>   tracepoint (tp_btf.s/sys_enter). Verify the program is triggered by
>   a syscall.
> - reject_non_faultable: Attempt to attach a sleepable BPF program to
>   a non-faultable tracepoint (tp_btf.s/sched_switch). Verify that
>   attachment is rejected.
> 
> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> ---
>  .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
>  .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
>  .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
>  tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
>  4 files changed, 117 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> new file mode 100644
> index 000000000000..9b0ec7cc4cac
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include <time.h>
> +#include "test_sleepable_raw_tp.skel.h"
> +#include "test_sleepable_raw_tp_fail.skel.h"
> +
> +static void test_sleepable_raw_tp_success(void)
> +{
> +	struct test_sleepable_raw_tp *skel;
> +	int err;
> +
> +	skel = test_sleepable_raw_tp__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
> +		return;
> +
> +	skel->bss->target_pid = getpid();
> +
> +	err = test_sleepable_raw_tp__attach(skel);
> +	if (!ASSERT_OK(err, "skel_attach"))
> +		goto cleanup;
> +
> +	syscall(__NR_nanosleep, &(struct timespec){ .tv_nsec = 555 }, NULL);
> +
> +	ASSERT_EQ(skel->bss->triggered, 1, "triggered");
> +	ASSERT_EQ(skel->bss->err, 0, "err");
> +	ASSERT_EQ(skel->bss->copied_tv_nsec, 555, "copied_tv_nsec");
> +
> +cleanup:
> +	test_sleepable_raw_tp__destroy(skel);
> +}
> +
> +static void test_sleepable_raw_tp_reject(void)
> +{
> +	struct test_sleepable_raw_tp_fail *skel;
> +	int err;
> +
> +	skel = test_sleepable_raw_tp_fail__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
> +		goto cleanup;
> +
> +	err = test_sleepable_raw_tp_fail__attach(skel);
> +	ASSERT_ERR(err, "skel_attach_should_fail");

would it be better to call RUN_TESTS(test_sleepable_raw_tp_fail) instead?
you could also check the verifier output

jirka


> [...]


* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs
  2026-03-09 21:11   ` Jiri Olsa
@ 2026-03-10  0:22     ` Mykyta Yatsenko
  0 siblings, 0 replies; 19+ messages in thread
From: Mykyta Yatsenko @ 2026-03-10  0:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, ast, andrii, daniel, kafai, kernel-team, eddyz87,
	Mykyta Yatsenko

On 3/9/26 9:11 PM, Jiri Olsa wrote:
> On Wed, Feb 25, 2026 at 08:23:54AM -0800, Mykyta Yatsenko wrote:
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Add two subtests:
>> - success: Attach a sleepable BPF program to the faultable sys_enter
>>    tracepoint (tp_btf.s/sys_enter). Verify the program is triggered by
>>    a syscall.
>> - reject_non_faultable: Attempt to attach a sleepable BPF program to
>>    a non-faultable tracepoint (tp_btf.s/sched_switch). Verify that
>>    attachment is rejected.
>>
>> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
>> ---
>>   .../selftests/bpf/prog_tests/sleepable_raw_tp.c    | 56 ++++++++++++++++++++++
>>   .../selftests/bpf/progs/test_sleepable_raw_tp.c    | 43 +++++++++++++++++
>>   .../bpf/progs/test_sleepable_raw_tp_fail.c         | 16 +++++++
>>   tools/testing/selftests/bpf/verifier/sleepable.c   |  5 +-
>>   4 files changed, 117 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
>> new file mode 100644
>> index 000000000000..9b0ec7cc4cac
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/prog_tests/sleepable_raw_tp.c
>> @@ -0,0 +1,56 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
>> +
>> +#include <test_progs.h>
>> +#include <time.h>
>> +#include "test_sleepable_raw_tp.skel.h"
>> +#include "test_sleepable_raw_tp_fail.skel.h"
>> +
>> +static void test_sleepable_raw_tp_success(void)
>> +{
>> +	struct test_sleepable_raw_tp *skel;
>> +	int err;
>> +
>> +	skel = test_sleepable_raw_tp__open_and_load();
>> +	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
>> +		return;
>> +
>> +	skel->bss->target_pid = getpid();
>> +
>> +	err = test_sleepable_raw_tp__attach(skel);
>> +	if (!ASSERT_OK(err, "skel_attach"))
>> +		goto cleanup;
>> +
>> +	syscall(__NR_nanosleep, &(struct timespec){ .tv_nsec = 555 }, NULL);
>> +
>> +	ASSERT_EQ(skel->bss->triggered, 1, "triggered");
>> +	ASSERT_EQ(skel->bss->err, 0, "err");
>> +	ASSERT_EQ(skel->bss->copied_tv_nsec, 555, "copied_tv_nsec");
>> +
>> +cleanup:
>> +	test_sleepable_raw_tp__destroy(skel);
>> +}
>> +
>> +static void test_sleepable_raw_tp_reject(void)
>> +{
>> +	struct test_sleepable_raw_tp_fail *skel;
>> +	int err;
>> +
>> +	skel = test_sleepable_raw_tp_fail__open_and_load();
>> +	if (!ASSERT_OK_PTR(skel, "skel_open_load"))
>> +		goto cleanup;
>> +
>> +	err = test_sleepable_raw_tp_fail__attach(skel);
>> +	ASSERT_ERR(err, "skel_attach_should_fail");
> 
> would it be better to call RUN_TESTS(test_sleepable_raw_tp_fail) instead?
> you could also check the verifier output
> 
> jirka
yes, definitely, thanks!
>
>> [...]



* Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
  2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
@ 2026-03-10 17:38         ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2026-03-10 17:38 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi, bot+bpf-ci
  Cc: mykyta.yatsenko5, bpf, ast, andrii, daniel, kafai, kernel-team,
	eddyz87, yatsenko, martin.lau, clm, ihor.solodrai



On 3/5/26 8:52 PM, Kumar Kartikeya Dwivedi wrote:
> On Fri, 6 Mar 2026 at 05:23, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>>> index 9bc0dfd235af..2dd345e0fdb0 100644
>>>> --- a/kernel/trace/bpf_trace.c
>>>> +++ b/kernel/trace/bpf_trace.c
>>> [ ... ]
>>>
>>>> @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
>>>>        run_ctx.bpf_cookie = link->cookie;
>>>>        old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>>>>
>>>> -     (void) bpf_prog_run(prog, args);
>>>> +     if (prog->sleepable) {
>>>> +             might_fault();
>>>> +             (void)bpf_prog_run(prog, args);
>>> At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
>>> still wraps the BPF callback in preempt_disable_notrace() /
>>> preempt_enable_notrace():
>>>
>>> include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
>>>      might_fault();
>>>      preempt_disable_notrace();
>>>      bpf_trace_run##x(__data, ...);
>>>      preempt_enable_notrace();
>>> }
>>>
>>> Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
>>> called between the preempt_disable_notrace() and
>>> preempt_enable_notrace(), the might_fault() here will execute with
>>> preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
>>> kernels.
>>>
>>> More importantly, if the sleepable program calls a helper like
>>> bpf_copy_from_user(), the page fault handler will see in_atomic()
>>> and return -EFAULT for any non-resident page instead of faulting it
>>> in, defeating the purpose of sleepable programs.
>>>
>>> The v1 RFC series included a separate patch to remove
>>> preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
>>> that change be folded into this commit so the sleepable path works
>>> at each bisect point?
>> But sleepable programs are not yet permitted, so I think this is not
>> needed for bisection-friendliness.
>>
>>>> +     } else {
>>>> +             /*
>>>> +              * Non-sleepable programs may run in the faultable context,
>>>> +              * do cant_sleep() only if program is non-sleepable and context
>>>> +              * is non-faultable.
>>>> +              */
>>>> +             if (!link->link.sleepable)
>>>> +                     cant_sleep();
>>>> +             rcu_read_lock();
>>>> +             (void)bpf_prog_run(prog, args);
>>>> +             rcu_read_unlock();
>>>> +     }
>>> [ ... ]
>>>
>>> The commit message states:
>>>
>>>> For non-sleepable programs, behavior is unchanged: cant_sleep() check,
>>>> rcu_read_lock() protection.
>>> The parent commit does not have a cant_sleep() call in
>>> __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
>>> conditional cant_sleep() is new in this commit.  Could this sentence
>>> be updated to reflect the actual change?
>> This seems worth adjusting.
> Actually, with the extra cant_sleep() check, we get this splat when
> running normal raw_tp tests.
> E.g.
>
> ...
> [  295.211949] BUG: assuming atomic context at kernel/trace/bpf_trace.c:2098
> [  295.212166] in_atomic(): 0, irqs_disabled(): 0, pid: 587, name: test_progs
> [  295.212316] 3 locks held by test_progs/587:
> ...
> [  295.213044] Call Trace:
> [  295.213046]  <TASK>
> [  295.213048]  dump_stack_lvl+0x54/0x70
> [  295.213055]  __cant_sleep+0xb7/0xd0
> [  295.213060]  bpf_trace_run1+0xce/0x340
> [  295.213072]  bpf_testmod_test_read+0x433/0x5e0 [bpf_testmod]
> [  295.213080]  ? srso_return_thunk+0x5/0x5f
> [  295.213093]  kernfs_fop_read_iter+0x166/0x220
> ...
>
> It means that if the tracepoint is not marked faultable but is invoked
> in a non-atomic context, we get a warning.
> It might be better to just drop the check, since I don't think it adds much value.

Agree. This is because the tracepoints defined in test_kmods are
non-sleepable by default. So we have
   prog: non-sleepable
   test_kmods assumed context: non-sleepable
   actual context: sleepable

This causes the warning.

BTW, can we merge patches 2 and 3 together? I know sleepable tp_btf is
not yet enabled in the verifier, but reading patch 2 on its own is a little
confusing, since its caller still has preempt_{disable,enable}_notrace().

>
>
>>>
>> Other than that LGTM.
>>
>> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>>
>>> [...]




Thread overview: 19+ messages
2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
2026-03-06  3:59   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
2026-02-25 17:12   ` bot+bpf-ci
2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
2026-03-10 17:38         ` Yonghong Song
2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
2026-03-06  4:25   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
2026-03-06 23:56     ` Kumar Kartikeya Dwivedi
2026-03-09 21:11   ` Jiri Olsa
2026-03-10  0:22     ` Mykyta Yatsenko
