public inbox for netdev@vger.kernel.org
* [PATCH v1 bpf] bpf: Disable migration in nf_hook_run_bpf().
From: Kuniyuki Iwashima @ 2025-07-17 18:58 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Daniel Xu, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Kuniyuki Iwashima, Kuniyuki Iwashima, bpf, netdev,
	netfilter-devel, syzbot+40f772d37250b6d10efc

syzbot reported that the IP defrag bpf prog can be called without
migration disabled.

Then the assertion in __bpf_prog_run() fails, triggering the splat
below. [0]

Let's call migrate_disable() before calling bpf_prog_run() in
nf_hook_run_bpf().

[0]:
BUG: assuming non migratable context at ./include/linux/filter.h:703
in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 5829, name: sshd-session
3 locks held by sshd-session/5829:
 #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1667 [inline]
 #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x20/0x50 net/ipv4/tcp.c:1395
 #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
 #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
 #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x69/0x26c0 net/ipv4/ip_output.c:470
 #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
 #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
 #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: nf_hook+0xb2/0x680 include/linux/netfilter.h:241
CPU: 0 UID: 0 PID: 5829 Comm: sshd-session Not tainted 6.16.0-rc6-syzkaller-00002-g155a3c003e55 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
 __cant_migrate kernel/sched/core.c:8860 [inline]
 __cant_migrate+0x1c7/0x250 kernel/sched/core.c:8834
 __bpf_prog_run include/linux/filter.h:703 [inline]
 bpf_prog_run include/linux/filter.h:725 [inline]
 nf_hook_run_bpf+0x83/0x1e0 net/netfilter/nf_bpf_link.c:20
 nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline]
 nf_hook_slow+0xbb/0x200 net/netfilter/core.c:623
 nf_hook+0x370/0x680 include/linux/netfilter.h:272
 NF_HOOK_COND include/linux/netfilter.h:305 [inline]
 ip_output+0x1bc/0x2a0 net/ipv4/ip_output.c:433
 dst_output include/net/dst.h:459 [inline]
 ip_local_out net/ipv4/ip_output.c:129 [inline]
 __ip_queue_xmit+0x1d7d/0x26c0 net/ipv4/ip_output.c:527
 __tcp_transmit_skb+0x2686/0x3e90 net/ipv4/tcp_output.c:1479
 tcp_transmit_skb net/ipv4/tcp_output.c:1497 [inline]
 tcp_write_xmit+0x1274/0x84e0 net/ipv4/tcp_output.c:2838
 __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3021
 tcp_push+0x225/0x700 net/ipv4/tcp.c:759
 tcp_sendmsg_locked+0x1870/0x42b0 net/ipv4/tcp.c:1359
 tcp_sendmsg+0x2e/0x50 net/ipv4/tcp.c:1396
 inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:851
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg net/socket.c:727 [inline]
 sock_write_iter+0x4aa/0x5b0 net/socket.c:1131
 new_sync_write fs/read_write.c:593 [inline]
 vfs_write+0x6c7/0x1150 fs/read_write.c:686
 ksys_write+0x1f8/0x250 fs/read_write.c:738
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe7d365d407
Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP:

Fixes: 91721c2d02d3 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/
Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/netfilter/nf_bpf_link.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
index 06b0848447003..dffe4cd6f4b0b 100644
--- a/net/netfilter/nf_bpf_link.c
+++ b/net/netfilter/nf_bpf_link.c
@@ -16,8 +16,13 @@ static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb,
 		.state = s,
 		.skb = skb,
 	};
+	unsigned int ret;
 
-	return bpf_prog_run(prog, &ctx);
+	migrate_disable();
+	ret = bpf_prog_run(prog, &ctx);
+	migrate_enable();
+
+	return ret;
 }
 
 struct bpf_nf_link {
-- 
2.50.0.727.gbf7dc18ff4-goog


* Re: [PATCH v1 bpf] bpf: Disable migration in nf_hook_run_bpf().
From: Alexei Starovoitov @ 2025-07-17 20:54 UTC
  To: Kuniyuki Iwashima
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Daniel Xu,
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Kuniyuki Iwashima, bpf, Network Development, netfilter-devel,
	syzbot+40f772d37250b6d10efc

On Thu, Jul 17, 2025 at 11:58 AM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> syzbot reported that the IP defrag bpf prog can be called without
> migration disabled.
>
> Then the assertion in __bpf_prog_run() fails, triggering the splat
> below. [0]
>
> Let's call migrate_disable() before calling bpf_prog_run() in
> nf_hook_run_bpf().
>
> [0]:
> [...]
>
> Fixes: 91721c2d02d3 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")

Fixes tag looks wrong.
I don't think it's Daniel's defrag series.
No idea why syzbot bisected it to this commit.

This is just a regular xmit path. Not related to defrag.

> Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/
> Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
>  net/netfilter/nf_bpf_link.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
> index 06b0848447003..dffe4cd6f4b0b 100644
> --- a/net/netfilter/nf_bpf_link.c
> +++ b/net/netfilter/nf_bpf_link.c
> @@ -16,8 +16,13 @@ static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb,
>                 .state = s,
>                 .skb = skb,
>         };
> +       unsigned int ret;
>
> -       return bpf_prog_run(prog, &ctx);
> +       migrate_disable();
> +       ret = bpf_prog_run(prog, &ctx);
> +       migrate_enable();

The fix looks correct, but we need to root-cause it better.
Why did it start now?
BPF_F_NETFILTER_IP_DEFRAG has been there for two years.

* Re: [PATCH v1 bpf] bpf: Disable migration in nf_hook_run_bpf().
From: Florian Westphal @ 2025-07-17 21:26 UTC
  To: Alexei Starovoitov
  Cc: Kuniyuki Iwashima, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Daniel Xu, Pablo Neira Ayuso, Jozsef Kadlecsik,
	Kuniyuki Iwashima, bpf, Network Development, netfilter-devel,
	syzbot+40f772d37250b6d10efc

Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > Let's call migrate_disable() before calling bpf_prog_run() in
> > nf_hook_run_bpf().

Or use bpf_prog_run_pin_on_cpu() which wraps bpf_prog_run().

> > Fixes: 91721c2d02d3 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
> 
> Fixes tag looks wrong.
> I don't think it's Daniel's defrag series.
> No idea why syzbot bisected it to this commit.

I didn't check, but I'd wager that before this commit the bpf prog attach
was rejected due to an unsupported flag.  It looks like the correct tag is

Fixes: fd9c663b9ad6 ("bpf: minimal support for programs hooked into netfilter framework")

I don't see anything that implicitly disables preemption and even 6.4 has
the cant_migrate() call there.

> > +       unsigned int ret;
> >
> > -       return bpf_prog_run(prog, &ctx);
> > +       migrate_disable();
> > +       ret = bpf_prog_run(prog, &ctx);
> > +       migrate_enable();
> 
> The fix looks correct, but we need to root cause it better.
> Why did it start now ?

I guess most people don't have preemptible rcu enabled.
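
One way to read this remark, sketched in userspace C rather than kernel code. It assumes that the check behind cant_migrate() stays quiet when IRQs are off, migration is disabled, or preempt_count is non-zero, and that rcu_read_lock() raises preempt_count only with non-preemptible RCU on a kernel that tracks preempt_count. struct ctx_state, would_splat() and the model_rcu_read_*() helpers are hypothetical names for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the context state the check looks at. */
struct ctx_state {
	bool irqs_disabled;
	int migration_disabled;	/* migrate_disable() nesting depth */
	int preempt_count;	/* > 0 while preemption is disabled */
};

/* True when the modeled check would report "assuming non migratable context". */
static bool would_splat(const struct ctx_state *s)
{
	if (s->irqs_disabled)
		return false;
	if (s->migration_disabled > 0)
		return false;
	if (s->preempt_count > 0)
		return false;
	return true;
}

/* Assumption: only non-preemptible RCU disables preemption in the read-side lock. */
static void model_rcu_read_lock(struct ctx_state *s, bool preemptible_rcu)
{
	if (!preemptible_rcu)
		s->preempt_count++;
}

static void model_rcu_read_unlock(struct ctx_state *s, bool preemptible_rcu)
{
	if (!preemptible_rcu) {
		assert(s->preempt_count > 0);	/* unlock must pair with a lock */
		s->preempt_count--;
	}
}
```

Under this model, a hook called inside rcu_read_lock() is covered by accident with non-preemptible RCU, while a preemptible-RCU config such as the PREEMPT(full) kernel in the splat reaches the check with all three counters at zero.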

* Re: [PATCH v1 bpf] bpf: Disable migration in nf_hook_run_bpf().
From: Kuniyuki Iwashima @ 2025-07-17 21:41 UTC
  To: Florian Westphal
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Daniel Xu, Pablo Neira Ayuso, Jozsef Kadlecsik,
	Kuniyuki Iwashima, bpf, Network Development, netfilter-devel,
	syzbot+40f772d37250b6d10efc

On Thu, Jul 17, 2025 at 2:26 PM Florian Westphal <fw@strlen.de> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > Let's call migrate_disable() before calling bpf_prog_run() in
> > > nf_hook_run_bpf().
>
> Or use bpf_prog_run_pin_on_cpu() which wraps bpf_prog_run().

Thanks, this is cleaner.
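
For reference, the helper boils down to the same three calls as the v1 diff. Below is a minimal userspace sketch of that pattern, not kernel code: migrate_disable(), migrate_enable(), cant_migrate() and bpf_prog_run() are simplified stand-ins that only borrow the kernel names, and prog_fn, migrate_depth, splat_count and accept_prog are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-ins; these are NOT the kernel implementations. */
static int migrate_depth;	/* models migrate_disable() nesting */
static int splat_count;		/* how often the modeled assertion fired */

static void migrate_disable(void)
{
	migrate_depth++;
}

static void migrate_enable(void)
{
	assert(migrate_depth > 0);	/* enable must pair with a disable */
	migrate_depth--;
}

/* Models the assertion: complain when migration is still allowed. */
static void cant_migrate(void)
{
	if (migrate_depth == 0) {
		splat_count++;
		fprintf(stderr, "BUG: assuming non migratable context\n");
	}
}

/* Hypothetical program type: a plain function taking the hook context. */
typedef unsigned int (*prog_fn)(const void *ctx);

static unsigned int bpf_prog_run(prog_fn prog, const void *ctx)
{
	cant_migrate();
	return prog(ctx);
}

/* The wrapper pattern: pin, run, unpin, then return the verdict. */
static unsigned int bpf_prog_run_pin_on_cpu(prog_fn prog, const void *ctx)
{
	unsigned int ret;

	migrate_disable();
	ret = bpf_prog_run(prog, ctx);
	migrate_enable();

	return ret;
}

/* Hypothetical program that always returns 1 (the value of NF_ACCEPT). */
static unsigned int accept_prog(const void *ctx)
{
	(void)ctx;
	return 1;
}
```

Calling bpf_prog_run(accept_prog, NULL) directly trips the modeled assertion, while bpf_prog_run_pin_on_cpu(accept_prog, NULL) returns the same verdict without it, so nf_hook_run_bpf() would only need to swap the call.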

>
> > > Fixes: 91721c2d02d3 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
> >
> > Fixes tag looks wrong.
> > I don't think it's Daniel's defrag series.
> > No idea why syzbot bisected it to this commit.
>
> I didn't check, but I'd wager that before this commit the bpf prog attach
> was rejected due to an unsupported flag.  It looks like the correct tag is
>
> Fixes: fd9c663b9ad6 ("bpf: minimal support for programs hooked into netfilter framework")

Sorry, I should've checked more closely.  This tag looks correct.


>
> I don't see anything that implicitly disables preemption and even 6.4 has
> the cant_migrate() call there.
>
> > > +       unsigned int ret;
> > >
> > > -       return bpf_prog_run(prog, &ctx);
> > > +       migrate_disable();
> > > +       ret = bpf_prog_run(prog, &ctx);
> > > +       migrate_enable();
> >
> > The fix looks correct, but we need to root cause it better.
> > Why did it start now ?
>
> I guess most people don't have preemptible rcu enabled.

I have no idea why syzbot found it only now; it has supported
the netfilter prog since 2023 too:

commit d966708639b67fe767995dfab47bf4296201993f
Author: Paul Chaignon <paul.chaignon@gmail.com>
Date:   Wed Sep 6 13:38:44 2023

    sys/linux: cover BPF links for BPF netfilter programs
