netdev.vger.kernel.org archive mirror
* [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
From: Sahil Chandna @ 2025-10-13 17:11 UTC
  To: ast, daniel, andrii, martin.lau, song, john.fastabend, haoluo,
	jolsa, bpf, netdev
  Cc: david.hunter.linux, skhan, khalid, chandna.linuxkernel,
	syzbot+1f1fbecb9413cdbfbef8

The timer mode is initialized to NO_PREEMPT by default. This
disables preemption and forces execution in atomic context, which
causes a problem on PREEMPT_RT configurations when spin_lock_bh()
is invoked, leading to the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
Preemption disabled at:
[<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42

Fix this by removing the NO_PREEMPT/NO_MIGRATE mode check.
Also, the test timer context no longer needs explicit calls to
migrate_disable()/migrate_enable() paired with
rcu_read_lock()/rcu_read_unlock(). Use the helpers
rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate() instead.

Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8
Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Signed-off-by: Sahil Chandna <chandna.linuxkernel@gmail.com>

---
Changes since v2:
- Fix uninitialized struct bpf_test_timer

Changes since v1:
- Dropped `enum { NO_PREEMPT, NO_MIGRATE } mode` from `struct bpf_test_timer`.
- Removed all conditional preempt/migrate disable logic.
- Unified timer handling to use `migrate_disable()` / `migrate_enable()` universally.

Link to v2: https://lore.kernel.org/all/20251010075923.408195-1-chandna.linuxkernel@gmail.com/
Link to v1: https://lore.kernel.org/all/20251006054320.159321-1-chandna.linuxkernel@gmail.com/

Testing:
- Reproduced syzbot bug locally using the provided reproducer.
- Observed `BUG: sleeping function called from invalid context` on v1.
- Confirmed bug disappears after applying this patch.
- Validated normal functionality of `bpf_prog_test_run_*` helpers with C
  reproducer.
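
The v2 changelog entry about the uninitialized struct can be illustrated
with a small user-space sketch. The `toy_` names and the stdint types are
assumptions made for illustration and are not kernel code. The loop
counter `i` and the `time_spent` accumulator are presumably expected to
start at zero, so once the `mode` field is gone every declaration still
needs an initializer:

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space mirror of struct bpf_test_timer after the patch.
 * Kernel u32/u64 are replaced by stdint types (assumption for this sketch).
 */
struct toy_test_timer {
	uint32_t i;
	uint64_t time_start, time_spent;
};

/* Hypothetical helper: returns a timer with every field set to zero. */
static struct toy_test_timer toy_timer_init(void)
{
	/*
	 * {0} is portable ISO C. The patch itself writes the empty
	 * braces {}, which GNU C and C23 accept with the same effect.
	 */
	struct toy_test_timer t = {0};

	return t;
}
```

A plain `struct toy_test_timer t;` on the stack would instead leave all
three fields indeterminate.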
---
 net/bpf/test_run.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index dfb03ee0bb62..f1719ea7a037 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -29,7 +29,6 @@
 #include <trace/events/bpf_test_run.h>
 
 struct bpf_test_timer {
-	enum { NO_PREEMPT, NO_MIGRATE } mode;
 	u32 i;
 	u64 time_start, time_spent;
 };
@@ -37,12 +36,7 @@ struct bpf_test_timer {
 static void bpf_test_timer_enter(struct bpf_test_timer *t)
 	__acquires(rcu)
 {
-	rcu_read_lock();
-	if (t->mode == NO_PREEMPT)
-		preempt_disable();
-	else
-		migrate_disable();
-
+	rcu_read_lock_dont_migrate();
 	t->time_start = ktime_get_ns();
 }
 
@@ -50,12 +44,7 @@ static void bpf_test_timer_leave(struct bpf_test_timer *t)
 	__releases(rcu)
 {
 	t->time_start = 0;
-
-	if (t->mode == NO_PREEMPT)
-		preempt_enable();
-	else
-		migrate_enable();
-	rcu_read_unlock();
+	rcu_read_unlock_migrate();
 }
 
 static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
@@ -374,7 +363,7 @@ static int bpf_test_run_xdp_live(struct bpf_prog *prog, struct xdp_buff *ctx,
 
 {
 	struct xdp_test_data xdp = { .batch_size = batch_size };
-	struct bpf_test_timer t = { .mode = NO_MIGRATE };
+	struct bpf_test_timer t = {};
 	int ret;
 
 	if (!repeat)
@@ -404,7 +393,7 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
 	struct bpf_prog_array_item item = {.prog = prog};
 	struct bpf_run_ctx *old_ctx;
 	struct bpf_cg_run_ctx run_ctx;
-	struct bpf_test_timer t = { NO_MIGRATE };
+	struct bpf_test_timer t = {};
 	enum bpf_cgroup_storage_type stype;
 	int ret;
 
@@ -1377,7 +1366,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 				     const union bpf_attr *kattr,
 				     union bpf_attr __user *uattr)
 {
-	struct bpf_test_timer t = { NO_PREEMPT };
+	struct bpf_test_timer t = {};
 	u32 size = kattr->test.data_size_in;
 	struct bpf_flow_dissector ctx = {};
 	u32 repeat = kattr->test.repeat;
@@ -1445,7 +1434,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kattr,
 				union bpf_attr __user *uattr)
 {
-	struct bpf_test_timer t = { NO_PREEMPT };
+	struct bpf_test_timer t = {};
 	struct bpf_prog_array *progs = NULL;
 	struct bpf_sk_lookup_kern ctx = {};
 	u32 repeat = kattr->test.repeat;
-- 
2.50.1
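
The difference between the old and new enter/leave paths can be sketched
with a toy user-space model. Everything here is an assumption made for
illustration: the `toy_` names are hypothetical, the counters only mimic
the `preempt_count` and RCU nest depth fields printed in the BUG report
above, and the new path is modelled simply as "disable migration, then
take the RCU read lock", which may differ in detail from the real
rcu_read_lock_dont_migrate() helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-task state; not the kernel's task_struct. */
struct toy_task {
	int preempt_count;    /* > 0: atomic context, sleeping is forbidden */
	int migrate_disabled; /* > 0: pinned to this CPU, still preemptible */
	int rcu_depth;        /* RCU read-side nesting depth */
};

enum toy_mode { TOY_NO_PREEMPT, TOY_NO_MIGRATE };

/*
 * On PREEMPT_RT a spin_lock_bh() may sleep, so it is only legal when
 * preemption is enabled. Returns true if taking the lock would be fine.
 */
static bool toy_rt_lock_allowed(const struct toy_task *t)
{
	return t->preempt_count == 0;
}

/* Old behaviour: the mode decides between preempt and migrate disable. */
static void toy_timer_enter_old(struct toy_task *t, enum toy_mode mode)
{
	t->rcu_depth++;
	if (mode == TOY_NO_PREEMPT)
		t->preempt_count++;
	else
		t->migrate_disabled++;
}

static void toy_timer_leave_old(struct toy_task *t, enum toy_mode mode)
{
	if (mode == TOY_NO_PREEMPT)
		t->preempt_count--;
	else
		t->migrate_disabled--;
	t->rcu_depth--;
}

/* New behaviour: never touch preempt_count, only pin and enter RCU. */
static void toy_timer_enter_new(struct toy_task *t)
{
	t->migrate_disabled++;
	t->rcu_depth++;
}

static void toy_timer_leave_new(struct toy_task *t)
{
	t->rcu_depth--;
	t->migrate_disabled--;
}
```

In this model, entering with TOY_NO_PREEMPT leaves preempt_count at 1 and
RCU depth at 1, the same values the warning reports, so the lock check
fails. The new path keeps preempt_count at 0, so the same lock is allowed.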



* Re: [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
From: Yonghong Song @ 2025-10-13 18:35 UTC
  To: Sahil Chandna, ast, daniel, andrii, martin.lau, song,
	john.fastabend, haoluo, jolsa, bpf, netdev
  Cc: david.hunter.linux, skhan, khalid



On 10/13/25 10:11 AM, Sahil Chandna wrote:
> The timer mode is initialized to NO_PREEMPT mode by default,
> this disable preemption and force execution in atomic context
> causing issue on PREEMPT_RT configurations when invoking
> spin_lock_bh(), leading to the following warning:
>
> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
> preempt_count: 1, expected: 0
> RCU nest depth: 1, expected: 1
> Preemption disabled at:
> [<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42
>
> Fix this, by removing NO_PREEMPT/NO_MIGRATE mode check.
> Also, the test timer context no longer needs explicit calls to
> migrate_disable()/migrate_enable() with rcu_read_lock()/rcu_read_unlock().
> Use helpers rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate()
> instead.
>
> Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8
> Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
> Signed-off-by: Sahil Chandna <chandna.linuxkernel@gmail.com>

You have multiple versions in CI:
   [PATCH v2] bpf: avoid sleeping in invalid context during sock_map_delete_elem path
   [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG

In the future, please submit a new patch set only after the old patch has received some reviews.

I also recommend replacing e.g. [PATCH v3] with [PATCH bpf v3] (or [PATCH bpf-next v3])
so CI can do proper testing for either bpf or bpf-next.

For the title:
   bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
Change to:
   bpf: Fix sleep-in-atomic BUG in timer path with RT kernel

The code change LGTM.

Acked-by: Yonghong Song <yonghong.song@linux.dev>

>
> ---
> Changes since v2:
> - Fix uninitialized struct bpf_test_timer
>
> Changes since v1:
> - Dropped `enum { NO_PREEMPT, NO_MIGRATE } mode` from `struct bpf_test_timer`.
> - Removed all conditional preempt/migrate disable logic.
> - Unified timer handling to use `migrate_disable()` / `migrate_enable()` universally.
>
> Link to v2: https://lore.kernel.org/all/20251010075923.408195-1-chandna.linuxkernel@gmail.com/
> Link to v1: https://lore.kernel.org/all/20251006054320.159321-1-chandna.linuxkernel@gmail.com/
>
> Testing:
> - Reproduced syzbot bug locally using the provided reproducer.
> - Observed `BUG: sleeping function called from invalid context` on v1.
> - Confirmed bug disappears after applying this patch.
> - Validated normal functionality of `bpf_prog_test_run_*` helpers with C
>    reproducer.
> ---
>   net/bpf/test_run.c | 23 ++++++-----------------
>   1 file changed, 6 insertions(+), 17 deletions(-)

[...]



* Re: [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
From: Brahmajit Das @ 2025-10-13 20:01 UTC
  To: Yonghong Song
  Cc: Sahil Chandna, ast, daniel, andrii, martin.lau, song,
	john.fastabend, haoluo, jolsa, bpf, netdev, david.hunter.linux,
	skhan, khalid

On 13.10.2025 11:35, Yonghong Song wrote:
> 
> 
> On 10/13/25 10:11 AM, Sahil Chandna wrote:
> > The timer mode is initialized to NO_PREEMPT mode by default,
> > this disable preemption and force execution in atomic context
> > causing issue on PREEMPT_RT configurations when invoking
> > spin_lock_bh(), leading to the following warning:
> > 
> > BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
> > preempt_count: 1, expected: 0
> > RCU nest depth: 1, expected: 1
> > Preemption disabled at:
> > [<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42
> > 
> > Fix this, by removing NO_PREEMPT/NO_MIGRATE mode check.
> > Also, the test timer context no longer needs explicit calls to
> > migrate_disable()/migrate_enable() with rcu_read_lock()/rcu_read_unlock().
> > Use helpers rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate()
> > instead.
> > 
> > Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8
> > Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
> > Signed-off-by: Sahil Chandna <chandna.linuxkernel@gmail.com>
> 
> You have multiple versions in CI:
>   [PATCH v2] bpf: avoid sleeping in invalid context during sock_map_delete_elem path
>   [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
Yeah, my bad. The v2 is mine; I sent it a few minutes before Sahil's.

https://lore.kernel.org/all/20251013171122.1403859-1-listout@listout.xyz/T/
> 
> In the future, please submit new patch set only after some reviews on the old patch.
> 
> I also recommend to replace e.g. [PATCH v3] to [PATCH bpf v3] (or [PATCH bpf-next v3])
> so CI can do proper testing for either bpf or bpf-next.
> 
> For the title:
>   bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
> Change to:
>   bpf: Fix sleep-in-atomic BUG in timer path with RT kernel
> 
> The code change LGTM.
> 
> Acked-by: Yonghong Song <yonghong.song@linux.dev>
> 
> > 
> > ---
> > Changes since v2:
> > - Fix uninitialized struct bpf_test_timer
> > 
> > Changes since v1:
> > - Dropped `enum { NO_PREEMPT, NO_MIGRATE } mode` from `struct bpf_test_timer`.
> > - Removed all conditional preempt/migrate disable logic.
> > - Unified timer handling to use `migrate_disable()` / `migrate_enable()` universally.
> > 
> > Link to v2: https://lore.kernel.org/all/20251010075923.408195-1-chandna.linuxkernel@gmail.com/
> > Link to v1: https://lore.kernel.org/all/20251006054320.159321-1-chandna.linuxkernel@gmail.com/
> > 
> > Testing:
> > - Reproduced syzbot bug locally using the provided reproducer.
> > - Observed `BUG: sleeping function called from invalid context` on v1.
> > - Confirmed bug disappears after applying this patch.
> > - Validated normal functionality of `bpf_prog_test_run_*` helpers with C
> >    reproducer.
> > ---
> >   net/bpf/test_run.c | 23 ++++++-----------------
> >   1 file changed, 6 insertions(+), 17 deletions(-)
> 
> [...]
> 

-- 
Regards,
listout


* Re: [PATCH v3] bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
From: Brahmajit Das @ 2025-10-13 20:06 UTC
  To: Yonghong Song
  Cc: Sahil Chandna, ast, daniel, andrii, martin.lau, song,
	john.fastabend, haoluo, jolsa, bpf, netdev, david.hunter.linux,
	skhan, khalid

On 14.10.2025 01:31, Brahmajit Das wrote:
> On 13.10.2025 11:35, Yonghong Song wrote:
> > 
> > 
> > On 10/13/25 10:11 AM, Sahil Chandna wrote:
> > > The timer mode is initialized to NO_PREEMPT mode by default,
...snip...
> Yeah, my bad. The v2 is mine, which I send few mins before Sahil
> 
> https://lore.kernel.org/all/20251013171122.1403859-1-listout@listout.xyz/T/
https://lore.kernel.org/all/20251013171122.1403859-1-listout@listout.xyz/
> > 
> > In the future, please submit new patch set only after some reviews on the old patch.
> > 
> > I also recommend to replace e.g. [PATCH v3] to [PATCH bpf v3] (or [PATCH bpf-next v3])
> > so CI can do proper testing for either bpf or bpf-next.
> > 
> > For the title:
> >   bpf: test_run: fix atomic context in timer path causing sleep-in-atomic BUG
> > Change to:
> >   bpf: Fix sleep-in-atomic BUG in timer path with RT kernel
> > 
> > The code change LGTM.
> > 
> > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > 
> > > 
> > > ---
> > > Changes since v2:
> > > - Fix uninitialized struct bpf_test_timer
...snip...

-- 
Regards,
listout

