* [PATCH net] bpf: Don't redirect too small packets
@ 2024-03-22 12:24 Eric Dumazet
2024-03-22 14:10 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2024-03-22 12:24 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni, Daniel Borkmann,
Alexei Starovoitov, Martin KaFai Lau
Cc: Andrii Nakryiko, netdev, bpf, eric.dumazet, Eric Dumazet,
syzbot+9e27778c0edc62cb97d8, Stanislav Fomichev, Willem de Bruijn
Some drivers ndo_start_xmit() expect a minimal size, as shown
by various syzbot reports [1].
Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len")
the missing attribute that can be used by upper layers.
We need to use it in __bpf_redirect_common().
[1]
BUG: KMSAN: uninit-value in erspan_build_header+0x170/0x2f0 include/net/erspan.h:197
erspan_build_header+0x170/0x2f0 include/net/erspan.h:197
erspan_xmit+0x128a/0x1ec0 net/ipv4/ip_gre.c:706
__netdev_start_xmit include/linux/netdevice.h:4903 [inline]
netdev_start_xmit include/linux/netdevice.h:4917 [inline]
xmit_one net/core/dev.c:3531 [inline]
dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
sch_direct_xmit+0x3c5/0xd50 net/sched/sch_generic.c:343
__dev_xmit_skb net/core/dev.c:3760 [inline]
__dev_queue_xmit+0x2e6a/0x52c0 net/core/dev.c:4301
dev_queue_xmit include/linux/netdevice.h:3091 [inline]
__bpf_tx_skb net/core/filter.c:2136 [inline]
__bpf_redirect_common net/core/filter.c:2180 [inline]
__bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
____bpf_clone_redirect net/core/filter.c:2460 [inline]
bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
__bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4240
__sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5649
__do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
__x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5736
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was created at:
slab_post_alloc_hook mm/slub.c:3804 [inline]
slab_alloc_node mm/slub.c:3845 [inline]
kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
pskb_expand_head+0x222/0x19d0 net/core/skbuff.c:2245
__skb_cow include/linux/skbuff.h:3671 [inline]
skb_cow_head include/linux/skbuff.h:3705 [inline]
erspan_xmit+0xb08/0x1ec0 net/ipv4/ip_gre.c:692
__netdev_start_xmit include/linux/netdevice.h:4903 [inline]
netdev_start_xmit include/linux/netdevice.h:4917 [inline]
xmit_one net/core/dev.c:3531 [inline]
dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
sch_direct_xmit+0x3c5/0xd50 net/sched/sch_generic.c:343
__dev_xmit_skb net/core/dev.c:3760 [inline]
__dev_queue_xmit+0x2e6a/0x52c0 net/core/dev.c:4301
dev_queue_xmit include/linux/netdevice.h:3091 [inline]
__bpf_tx_skb net/core/filter.c:2136 [inline]
__bpf_redirect_common net/core/filter.c:2180 [inline]
__bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
____bpf_clone_redirect net/core/filter.c:2460 [inline]
bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
__bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4240
__sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5649
__do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
__x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5736
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
CPU: 0 PID: 5041 Comm: syz-executor167 Not tainted 6.8.0-syzkaller-11743-ga4145ce1e7bc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len")
Reported-by: syzbot+9e27778c0edc62cb97d8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/000000000000205af206143ece22@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Willem de Bruijn <willemb@google.com>
---
net/core/filter.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c
index 8adf95765cdd967a15b2661dfb454db0ccf350b0..745697c08acb3a74721d26ee93389efa81e973a0 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2175,6 +2175,11 @@ static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev,
return -ERANGE;
}
+ if (unlikely(skb->len < dev->min_header_len)) {
+ kfree_skb(skb);
+ return -ERANGE;
+ }
+
bpf_push_mac_rcsum(skb);
return flags & BPF_F_INGRESS ?
__bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb);
--
2.44.0.396.g6e790dbe36-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-22 12:24 [PATCH net] bpf: Don't redirect too small packets Eric Dumazet @ 2024-03-22 14:10 ` patchwork-bot+netdevbpf 2024-03-23 3:01 ` Alexei Starovoitov 0 siblings, 1 reply; 13+ messages in thread From: patchwork-bot+netdevbpf @ 2024-03-22 14:10 UTC (permalink / raw) To: Eric Dumazet Cc: davem, kuba, pabeni, daniel, ast, martin.lau, andrii, netdev, bpf, eric.dumazet, syzbot+9e27778c0edc62cb97d8, sdf, willemb Hello: This patch was applied to bpf/bpf.git (master) by Daniel Borkmann <daniel@iogearbox.net>: On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > Some drivers ndo_start_xmit() expect a minimal size, as shown > by various syzbot reports [1]. > > Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > the missing attribute that can be used by upper layers. > > We need to use it in __bpf_redirect_common(). > > [...] Here is the summary with links: - [net] bpf: Don't redirect too small packets https://git.kernel.org/bpf/bpf/c/96b0e83341e8 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-22 14:10 ` patchwork-bot+netdevbpf @ 2024-03-23 3:01 ` Alexei Starovoitov 2024-03-25 13:33 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Alexei Starovoitov @ 2024-03-23 3:01 UTC (permalink / raw) To: patchwork-bot+netdevbpf Cc: Eric Dumazet, David S. Miller, Jakub Kicinski, Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Stanislav Fomichev, Willem de Bruijn On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > Hello: > > This patch was applied to bpf/bpf.git (master) > by Daniel Borkmann <daniel@iogearbox.net>: > > On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > Some drivers ndo_start_xmit() expect a minimal size, as shown > > by various syzbot reports [1]. > > > > Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > the missing attribute that can be used by upper layers. > > > > We need to use it in __bpf_redirect_common(). This patch broke empty_skb test: $ test_progs -t empty_skb test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress [redirect_ingress]: actual -34 != expected 0 test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress [redirect_egress]: actual -34 != expected 1 And looking at the test I think it's not a test issue. This check if (unlikely(skb->len < dev->min_header_len)) is rejecting more than it should. So I reverted this patch for now. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-23 3:01 ` Alexei Starovoitov @ 2024-03-25 13:33 ` Eric Dumazet 2024-03-25 15:47 ` Daniel Borkmann 2024-03-25 15:56 ` Alexei Starovoitov 0 siblings, 2 replies; 13+ messages in thread From: Eric Dumazet @ 2024-03-25 13:33 UTC (permalink / raw) To: Alexei Starovoitov, Guillaume Nault Cc: patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Stanislav Fomichev, Willem de Bruijn On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > > > Hello: > > > > This patch was applied to bpf/bpf.git (master) > > by Daniel Borkmann <daniel@iogearbox.net>: > > > > On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > > Some drivers ndo_start_xmit() expect a minimal size, as shown > > > by various syzbot reports [1]. > > > > > > Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > > the missing attribute that can be used by upper layers. > > > > > > We need to use it in __bpf_redirect_common(). > > This patch broke empty_skb test: > $ test_progs -t empty_skb > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > [redirect_ingress]: actual -34 != expected 0 > test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > [redirect_egress]: actual -34 != expected 1 > > And looking at the test I think it's not a test issue. > This check > if (unlikely(skb->len < dev->min_header_len)) > is rejecting more than it should. > > So I reverted this patch for now. OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I move my sanity test in __bpf_tx_skb(), the bpf test program still fails, I am suspecting the test needs to be adjusted. diff --git a/net/core/filter.c b/net/core/filter.c index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) return -ENETDOWN; } + if (unlikely(skb->len < dev->min_header_len)) { + pr_err_once("__bpf_tx_skb skb->len=%u < dev(%s)->min_header_len(%u)\n", skb->len, dev->name, dev->min_header_len); + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); + kfree_skb(skb); + return -ERANGE; + } // Note: this is before we change skb->dev skb->dev = dev; skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); skb_clear_tstamp(skb); --> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress [redirect_egress]: actual -34 != expected 1 [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 mac=(64,14) net=(78,-1) trans=-1 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 Note that veth driver is one of the few 'Ethernet' drivers that make sure to get at least 14 bytes in the skb at ndo_start_xmit() after commit 726e2c5929de841fdcef4e2bf995680688ae1b87 ("veth: Ensure eth header is in skb's linear part") BTW this last patch (changing veth) should have been done generically from act_mirred (We do not want to patch ~400 ethernet drivers in the tree) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-25 13:33 ` Eric Dumazet @ 2024-03-25 15:47 ` Daniel Borkmann 2024-03-25 15:56 ` Alexei Starovoitov 1 sibling, 0 replies; 13+ messages in thread From: Daniel Borkmann @ 2024-03-25 15:47 UTC (permalink / raw) To: Eric Dumazet, Alexei Starovoitov, Guillaume Nault Cc: patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Stanislav Fomichev, Willem de Bruijn On 3/25/24 2:33 PM, Eric Dumazet wrote: > On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: >> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: >>> >>> Hello: >>> >>> This patch was applied to bpf/bpf.git (master) >>> by Daniel Borkmann <daniel@iogearbox.net>: >>> >>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: >>>> Some drivers ndo_start_xmit() expect a minimal size, as shown >>>> by various syzbot reports [1]. >>>> >>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") >>>> the missing attribute that can be used by upper layers. >>>> >>>> We need to use it in __bpf_redirect_common(). >> >> This patch broke empty_skb test: >> $ test_progs -t empty_skb >> >> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress >> [redirect_ingress]: actual -34 != expected 0 >> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec >> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress >> [redirect_egress]: actual -34 != expected 1 >> >> And looking at the test I think it's not a test issue. >> This check >> if (unlikely(skb->len < dev->min_header_len)) >> is rejecting more than it should. >> >> So I reverted this patch for now. > > OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > move my sanity test in __bpf_tx_skb(), > the bpf test program still fails, I am suspecting the test needs to be adjusted. Let me take a look, I do think so too that we'd need to adjust the test. I'll see to have a patch for the latter so that we can reapply the fix as-is along with it given it's correct. > diff --git a/net/core/filter.c b/net/core/filter.c > index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > net_device *dev, struct sk_buff *skb) > return -ENETDOWN; > } > > + if (unlikely(skb->len < dev->min_header_len)) { > + pr_err_once("__bpf_tx_skb skb->len=%u < > dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > dev->min_header_len); > + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > + kfree_skb(skb); > + return -ERANGE; > + } // Note: this is before we change skb->dev > skb->dev = dev; > skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > skb_clear_tstamp(skb); > > > --> > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > [redirect_egress]: actual -34 != expected 1 > > [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > mac=(64,14) net=(78,-1) trans=-1 > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > > Note that veth driver is one of the few 'Ethernet' drivers that make > sure to get at least 14 bytes in the skb at ndo_start_xmit() > after commit 726e2c5929de841fdcef4e2bf995680688ae1b87 ("veth: Ensure > eth header is in skb's linear part") > > BTW this last patch (changing veth) should have been done generically > from act_mirred > > (We do not want to patch ~400 ethernet drivers in the tree) > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-25 13:33 ` Eric Dumazet 2024-03-25 15:47 ` Daniel Borkmann @ 2024-03-25 15:56 ` Alexei Starovoitov 2024-03-25 16:28 ` Stanislav Fomichev 1 sibling, 1 reply; 13+ messages in thread From: Alexei Starovoitov @ 2024-03-25 15:56 UTC (permalink / raw) To: Eric Dumazet, Stanislav Fomichev Cc: Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > > On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > > > > > Hello: > > > > > > This patch was applied to bpf/bpf.git (master) > > > by Daniel Borkmann <daniel@iogearbox.net>: > > > > > > On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > > > Some drivers ndo_start_xmit() expect a minimal size, as shown > > > > by various syzbot reports [1]. > > > > > > > > Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > > > the missing attribute that can be used by upper layers. > > > > > > > > We need to use it in __bpf_redirect_common(). > > > > This patch broke empty_skb test: > > $ test_progs -t empty_skb > > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > > [redirect_ingress]: actual -34 != expected 0 > > test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > [redirect_egress]: actual -34 != expected 1 > > > > And looking at the test I think it's not a test issue. > > This check > > if (unlikely(skb->len < dev->min_header_len)) > > is rejecting more than it should. > > > > So I reverted this patch for now. > > OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > move my sanity test in __bpf_tx_skb(), > the bpf test program still fails, I am suspecting the test needs to be adjusted. > > > > diff --git a/net/core/filter.c b/net/core/filter.c > index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > net_device *dev, struct sk_buff *skb) > return -ENETDOWN; > } > > + if (unlikely(skb->len < dev->min_header_len)) { > + pr_err_once("__bpf_tx_skb skb->len=%u < > dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > dev->min_header_len); > + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > + kfree_skb(skb); > + return -ERANGE; > + } // Note: this is before we change skb->dev > skb->dev = dev; > skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > skb_clear_tstamp(skb); > > > --> > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > [redirect_egress]: actual -34 != expected 1 > > [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > mac=(64,14) net=(78,-1) trans=-1 > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 Hmm. Something is off. That test creates 15 byte skb. It's not obvious to me how it got reduced to 1. Something stripped L2 header and the prog is trying to redirect such skb into veth that expects skb with L2 ? Stan, please take a look. Since you wrote that test. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-25 15:56 ` Alexei Starovoitov @ 2024-03-25 16:28 ` Stanislav Fomichev 2024-03-26 12:46 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: Stanislav Fomichev @ 2024-03-25 16:28 UTC (permalink / raw) To: Alexei Starovoitov Cc: Eric Dumazet, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On 03/25, Alexei Starovoitov wrote: > On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > > > > On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > > > > > > > Hello: > > > > > > > > This patch was applied to bpf/bpf.git (master) > > > > by Daniel Borkmann <daniel@iogearbox.net>: > > > > > > > > On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > > > > Some drivers ndo_start_xmit() expect a minimal size, as shown > > > > > by various syzbot reports [1]. > > > > > > > > > > Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > > > > the missing attribute that can be used by upper layers. > > > > > > > > > > We need to use it in __bpf_redirect_common(). > > > > > > This patch broke empty_skb test: > > > $ test_progs -t empty_skb > > > > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > > [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > > > [redirect_ingress]: actual -34 != expected 0 > > > test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > > [redirect_egress]: actual -34 != expected 1 > > > > > > And looking at the test I think it's not a test issue. > > > This check > > > if (unlikely(skb->len < dev->min_header_len)) > > > is rejecting more than it should. > > > > > > So I reverted this patch for now. > > > > OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > > move my sanity test in __bpf_tx_skb(), > > the bpf test program still fails, I am suspecting the test needs to be adjusted. > > > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > > 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > > net_device *dev, struct sk_buff *skb) > > return -ENETDOWN; > > } > > > > + if (unlikely(skb->len < dev->min_header_len)) { > > + pr_err_once("__bpf_tx_skb skb->len=%u < > > dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > > dev->min_header_len); > > + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > > + kfree_skb(skb); > > + return -ERANGE; > > + } // Note: this is before we change skb->dev > > skb->dev = dev; > > skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > > skb_clear_tstamp(skb); > > > > > > --> > > > > > > test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > [redirect_egress]: actual -34 != expected 1 > > > > [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > > [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > > mac=(64,14) net=(78,-1) trans=-1 > > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > > csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > > hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > > Hmm. Something is off. > That test creates 15 byte skb. > It's not obvious to me how it got reduced to 1. > Something stripped L2 header and the prog is trying to redirect > such skb into veth that expects skb with L2 ? > > Stan, > please take a look. > Since you wrote that test. Sure. Daniel wants to take a look on a separate thread, so we can sync up. Tentatively, seems like the failure is in the lwt path that does indeed drop the l2. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-25 16:28 ` Stanislav Fomichev @ 2024-03-26 12:46 ` Daniel Borkmann 2024-03-26 13:37 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2024-03-26 12:46 UTC (permalink / raw) To: Stanislav Fomichev, Alexei Starovoitov Cc: Eric Dumazet, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On 3/25/24 5:28 PM, Stanislav Fomichev wrote: > On 03/25, Alexei Starovoitov wrote: >> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: >>> >>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov >>> <alexei.starovoitov@gmail.com> wrote: >>>> >>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: >>>>> >>>>> Hello: >>>>> >>>>> This patch was applied to bpf/bpf.git (master) >>>>> by Daniel Borkmann <daniel@iogearbox.net>: >>>>> >>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: >>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown >>>>>> by various syzbot reports [1]. >>>>>> >>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") >>>>>> the missing attribute that can be used by upper layers. >>>>>> >>>>>> We need to use it in __bpf_redirect_common(). >>>> >>>> This patch broke empty_skb test: >>>> $ test_progs -t empty_skb >>>> >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress >>>> [redirect_ingress]: actual -34 != expected 0 >>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress >>>> [redirect_egress]: actual -34 != expected 1 >>>> >>>> And looking at the test I think it's not a test issue. >>>> This check >>>> if (unlikely(skb->len < dev->min_header_len)) >>>> is rejecting more than it should. >>>> >>>> So I reverted this patch for now. >>> >>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I >>> move my sanity test in __bpf_tx_skb(), >>> the bpf test program still fails, I am suspecting the test needs to be adjusted. >>> >>> >>> >>> diff --git a/net/core/filter.c b/net/core/filter.c >>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d >>> 100644 >>> --- a/net/core/filter.c >>> +++ b/net/core/filter.c >>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct >>> net_device *dev, struct sk_buff *skb) >>> return -ENETDOWN; >>> } >>> >>> + if (unlikely(skb->len < dev->min_header_len)) { >>> + pr_err_once("__bpf_tx_skb skb->len=%u < >>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, >>> dev->min_header_len); >>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); >>> + kfree_skb(skb); >>> + return -ERANGE; >>> + } // Note: this is before we change skb->dev >>> skb->dev = dev; >>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); >>> skb_clear_tstamp(skb); >>> >>> >>> --> >>> >>> >>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress >>> [redirect_egress]: actual -34 != expected 1 >>> >>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) >>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 >>> mac=(64,14) net=(78,-1) trans=-1 >>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) >>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) >>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 >> >> Hmm. Something is off. >> That test creates 15 byte skb. >> It's not obvious to me how it got reduced to 1. >> Something stripped L2 header and the prog is trying to redirect >> such skb into veth that expects skb with L2 ? >> >> Stan, >> please take a look. >> Since you wrote that test. > > Sure. Daniel wants to take a look on a separate thread, so we can sync > up. Tentatively, seems like the failure is in the lwt path that does > indeed drop the l2. If we'd change the test into the below, the tc and empty_skb tests pass. run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus skb->len is 1 in this test. We do use skb_mac_header_len() also in other tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best way forward.. static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, u32 flags) { /* Verify that a link layer header is carried */ if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { kfree_skb(skb); return -ERANGE; } if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { kfree_skb(skb); return -ERANGE; } bpf_push_mac_rcsum(skb); return flags & BPF_F_INGRESS ? __bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb); } Thanks, Daniel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-26 12:46 ` Daniel Borkmann @ 2024-03-26 13:37 ` Eric Dumazet 2024-03-26 13:38 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2024-03-26 13:37 UTC (permalink / raw) To: Daniel Borkmann Cc: Stanislav Fomichev, Alexei Starovoitov, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On Tue, Mar 26, 2024 at 1:46 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 3/25/24 5:28 PM, Stanislav Fomichev wrote: > > On 03/25, Alexei Starovoitov wrote: > >> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > >>> > >>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > >>> <alexei.starovoitov@gmail.com> wrote: > >>>> > >>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > >>>>> > >>>>> Hello: > >>>>> > >>>>> This patch was applied to bpf/bpf.git (master) > >>>>> by Daniel Borkmann <daniel@iogearbox.net>: > >>>>> > >>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > >>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown > >>>>>> by various syzbot reports [1]. > >>>>>> > >>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > >>>>>> the missing attribute that can be used by upper layers. > >>>>>> > >>>>>> We need to use it in __bpf_redirect_common(). > >>>> > >>>> This patch broke empty_skb test: > >>>> $ test_progs -t empty_skb > >>>> > >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>>> [redirect_ingress]: actual -34 != expected 0 > >>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>>> [redirect_egress]: actual -34 != expected 1 > >>>> > >>>> And looking at the test I think it's not a test issue. > >>>> This check > >>>> if (unlikely(skb->len < dev->min_header_len)) > >>>> is rejecting more than it should. > >>>> > >>>> So I reverted this patch for now. > >>> > >>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > >>> move my sanity test in __bpf_tx_skb(), > >>> the bpf test program still fails, I am suspecting the test needs to be adjusted. > >>> > >>> > >>> > >>> diff --git a/net/core/filter.c b/net/core/filter.c > >>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > >>> 100644 > >>> --- a/net/core/filter.c > >>> +++ b/net/core/filter.c > >>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > >>> net_device *dev, struct sk_buff *skb) > >>> return -ENETDOWN; > >>> } > >>> > >>> + if (unlikely(skb->len < dev->min_header_len)) { > >>> + pr_err_once("__bpf_tx_skb skb->len=%u < > >>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > >>> dev->min_header_len); > >>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > >>> + kfree_skb(skb); > >>> + return -ERANGE; > >>> + } // Note: this is before we change skb->dev > >>> skb->dev = dev; > >>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > >>> skb_clear_tstamp(skb); > >>> > >>> > >>> --> > >>> > >>> > >>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>> [redirect_egress]: actual -34 != expected 1 > >>> > >>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > >>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > >>> mac=(64,14) net=(78,-1) trans=-1 > >>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > >>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > >>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > >> > >> Hmm. Something is off. > >> That test creates 15 byte skb. > >> It's not obvious to me how it got reduced to 1. > >> Something stripped L2 header and the prog is trying to redirect > >> such skb into veth that expects skb with L2 ? > >> > >> Stan, > >> please take a look. > >> Since you wrote that test. > > > > Sure. Daniel wants to take a look on a separate thread, so we can sync > > up. Tentatively, seems like the failure is in the lwt path that does > > indeed drop the l2. > > If we'd change the test into the below, the tc and empty_skb tests pass. > run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus > skb->len is 1 in this test. We do use skb_mac_header_len() also in other > tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best > way forward.. > > static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > u32 flags) > { > /* Verify that a link layer header is carried */ > if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { > kfree_skb(skb); > return -ERANGE; > } > > if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { Unfortunately this will not prevent frames with skb->len == 1 to reach an Ethernet driver ndo_start_xmit() At ndo_start_xmit(), we do not look where the MAC header supposedly starts in the skb, we only use skb->data I have a syzbot repro using team driver, so I added the following part in team : diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index 0a44bbdcfb7b9f30a0c27b700246501c5eba322f..75e5ef585a8f05b35cfddbae0bfc377864e6e38c 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1714,6 +1714,11 @@ static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev) bool tx_success; unsigned int len = skb->len; + if (len < 14) { + pr_err_once("team_xmit(len=%u)\n", len); + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); + WARN_ON_ONCE(1); + } tx_success = team_queue_override_transmit(team, skb); if (!tx_success) tx_success = team->ops.transmit(team, skb); And I get (with your suggestion instead of skb->len) mac=(78,0) net=(78,-1) trans=-1 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x88a8 pkttype=3 iif=0 [ 41.126553] dev name=team0 feat=0x0000e0064fddfbe9 [ 41.127132] skb linear: 00000000: 55 [ 41.128487] ------------[ cut here ]------------ [ 41.128551] WARNING: CPU: 2 PID: 1880 at drivers/net/team/team.c:1720 team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) [ 41.129072] Modules linked in: macsec macvtap macvlan hsr wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 batman_adv dummy bridge sr_mod cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p netfs 9pnet [ 41.129613] CPU: 2 PID: 1880 Comm: b330650301 Not tainted 6.8.0-virtme #238 [ 41.129664] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 41.129780] RIP: 0010:team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) [ 41.129847] Code: 41 54 53 44 8b 7f 70 48 89 fb 41 83 ff 0d 77 1c 80 3d a0 24 8d 01 00 0f 84 0d 01 00 00 80 3d 92 24 8d 01 00 0f 84 e3 00 00 00 <0f> 0b 41 80 be 21 0b 00 00 00 0f 84 9d 00 00 00 0f b7 43 7c 66 85 All code ======== 0: 41 54 push %r12 2: 53 push %rbx 3: 44 8b 7f 70 mov 0x70(%rdi),%r15d 7: 48 89 fb mov %rdi,%rbx a: 41 83 ff 0d cmp $0xd,%r15d e: 77 1c ja 0x2c 10: 80 3d a0 24 8d 01 00 cmpb $0x0,0x18d24a0(%rip) # 0x18d24b7 17: 0f 84 0d 01 00 00 je 0x12a 1d: 80 3d 92 24 8d 01 00 cmpb $0x0,0x18d2492(%rip) # 0x18d24b6 24: 0f 84 e3 00 00 00 je 0x10d 2a:* 0f 0b ud2 <-- trapping instruction 2c: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) 33: 00 34: 0f 84 9d 00 00 00 je 0xd7 3a: 0f b7 43 7c movzwl 0x7c(%rbx),%eax 3e: 66 data16 3f: 85 .byte 0x85 Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) 9: 00 a: 0f 84 9d 00 00 00 je 0xad 10: 0f b7 43 7c movzwl 0x7c(%rbx),%eax 14: 66 data16 15: 85 .byte 0x85 [ 41.129902] RSP: 0018:ffffa4210433b938 EFLAGS: 00000246 [ 41.129945] RAX: 0000000000000000 RBX: ffffa4210858a300 RCX: 0000000000000000 [ 41.129961] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: 0000000000000001 [ 41.129975] RBP: ffffa4210433b960 R08: 0000000000000000 R09: ffffa4210433b630 [ 41.129989] R10: 0000000000000001 R11: ffffffff8407d340 R12: 0000000000000000 [ 41.130004] R13: ffffa4210ecee000 R14: ffffa4210ece4000 R15: 0000000000000001 [ 41.130074] FS: 00007f91d9549740(0000) GS:ffffa42fffa80000(0000) knlGS:0000000000000000 [ 41.130095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.130140] CR2: 00007f8953077fb0 CR3: 0000000104f42000 CR4: 00000000000006f0 [ 41.130229] Call Trace: [ 41.130331] <TASK> [ 41.130530] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 41.130598] ? __warn (kernel/panic.c:694) [ 41.130611] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) [ 41.130625] ? report_bug (lib/bug.c:180 lib/bug.c:219) [ 41.130640] ? handle_bug (arch/x86/kernel/traps.c:239) [ 41.130653] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) [ 41.130665] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) [ 41.130700] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) [ 41.130714] ? team_xmit (drivers/net/team/team.c:1719 (discriminator 6)) [ 41.130734] dev_hard_start_xmit (./include/linux/netdevice.h:4903 ./include/linux/netdevice.h:4917 net/core/dev.c:3531 net/core/dev.c:3547) [ 41.130768] __dev_queue_xmit (./include/linux/netdevice.h:3287 (discriminator 25) net/core/dev.c:4336 (discriminator 25)) [ 41.130780] ? kmalloc_reserve (net/core/skbuff.c:580 (discriminator 4)) [ 41.130796] ? pskb_expand_head (net/core/skbuff.c:2292) [ 41.130815] __bpf_redirect (./include/linux/netdevice.h:3287 (discriminator 25) net/core/filter.c:2143 (discriminator 25) net/core/filter.c:2172 (discriminator 25) net/core/filter.c:2196 (discriminator 25)) [ 41.130825] bpf_clone_redirect (net/core/filter.c:2467 (discriminator 1) net/core/filter.c:2439 (discriminator 1)) [ 41.130841] bpf_prog_9845f5eee09e82c6+0x61/0x66 [ 41.130948] ? bpf_ksym_find (./include/linux/rbtree_latch.h:113 ./include/linux/rbtree_latch.h:208 kernel/bpf/core.c:734) [ 41.130963] ? is_bpf_text_address (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:97 (discriminator 13) ./include/linux/rcupdate.h:813 (discriminator 13) kernel/bpf/core.c:769 (discriminator 13)) [ 41.130976] ? kernel_text_address (kernel/extable.c:125 (discriminator 1) kernel/extable.c:94 (discriminator 1)) [ 41.130989] ? __kernel_text_address (kernel/extable.c:79 (discriminator 1)) [ 41.131002] ? unwind_get_return_address (arch/x86/kernel/unwind_frame.c:19 (discriminator 1)) [ 41.131014] ? __pfx_stack_trace_consume_entry (kernel/stacktrace.c:83) [ 41.131028] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:26) [ 41.131044] ? stack_depot_save_flags (lib/stackdepot.c:675) [ 41.131062] ? ktime_get (kernel/time/timekeeping.c:292 kernel/time/timekeeping.c:388 kernel/time/timekeeping.c:848) [ 41.131076] bpf_test_run (./include/linux/bpf.h:1234 ./include/linux/filter.h:657 ./include/linux/filter.h:664 net/bpf/test_run.c:425) [ 41.131087] ? security_sk_alloc (security/security.c:4662 (discriminator 13)) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-26 13:37 ` Eric Dumazet @ 2024-03-26 13:38 ` Eric Dumazet 2024-03-26 17:57 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2024-03-26 13:38 UTC (permalink / raw) To: Daniel Borkmann Cc: Stanislav Fomichev, Alexei Starovoitov, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On Tue, Mar 26, 2024 at 2:37 PM Eric Dumazet <edumazet@google.com> wrote: > > On Tue, Mar 26, 2024 at 1:46 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On 3/25/24 5:28 PM, Stanislav Fomichev wrote: > > > On 03/25, Alexei Starovoitov wrote: > > >> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > > >>> > > >>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > > >>> <alexei.starovoitov@gmail.com> wrote: > > >>>> > > >>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > >>>>> > > >>>>> Hello: > > >>>>> > > >>>>> This patch was applied to bpf/bpf.git (master) > > >>>>> by Daniel Borkmann <daniel@iogearbox.net>: > > >>>>> > > >>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > >>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown > > >>>>>> by various syzbot reports [1]. > > >>>>>> > > >>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > >>>>>> the missing attribute that can be used by upper layers. > > >>>>>> > > >>>>>> We need to use it in __bpf_redirect_common(). > > >>>> > > >>>> This patch broke empty_skb test: > > >>>> $ test_progs -t empty_skb > > >>>> > > >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>>> [redirect_ingress]: actual -34 != expected 0 > > >>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > > >>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>>> [redirect_egress]: actual -34 != expected 1 > > >>>> > > >>>> And looking at the test I think it's not a test issue. > > >>>> This check > > >>>> if (unlikely(skb->len < dev->min_header_len)) > > >>>> is rejecting more than it should. > > >>>> > > >>>> So I reverted this patch for now. > > >>> > > >>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > > >>> move my sanity test in __bpf_tx_skb(), > > >>> the bpf test program still fails, I am suspecting the test needs to be adjusted. > > >>> > > >>> > > >>> > > >>> diff --git a/net/core/filter.c b/net/core/filter.c > > >>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > > >>> 100644 > > >>> --- a/net/core/filter.c > > >>> +++ b/net/core/filter.c > > >>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > > >>> net_device *dev, struct sk_buff *skb) > > >>> return -ENETDOWN; > > >>> } > > >>> > > >>> + if (unlikely(skb->len < dev->min_header_len)) { > > >>> + pr_err_once("__bpf_tx_skb skb->len=%u < > > >>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > > >>> dev->min_header_len); > > >>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > > >>> + kfree_skb(skb); > > >>> + return -ERANGE; > > >>> + } // Note: this is before we change skb->dev > > >>> skb->dev = dev; > > >>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > > >>> skb_clear_tstamp(skb); > > >>> > > >>> > > >>> --> > > >>> > > >>> > > >>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>> [redirect_egress]: actual -34 != expected 1 > > >>> > > >>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > > >>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > > >>> mac=(64,14) net=(78,-1) trans=-1 > > >>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > > >>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > > >>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > > >> > > >> Hmm. Something is off. > > >> That test creates 15 byte skb. > > >> It's not obvious to me how it got reduced to 1. > > >> Something stripped L2 header and the prog is trying to redirect > > >> such skb into veth that expects skb with L2 ? > > >> > > >> Stan, > > >> please take a look. > > >> Since you wrote that test. > > > > > > Sure. Daniel wants to take a look on a separate thread, so we can sync > > > up. Tentatively, seems like the failure is in the lwt path that does > > > indeed drop the l2. > > > > If we'd change the test into the below, the tc and empty_skb tests pass. > > run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus > > skb->len is 1 in this test. We do use skb_mac_header_len() also in other > > tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best > > way forward.. > > > > static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > > u32 flags) > > { > > /* Verify that a link layer header is carried */ > > if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { > > kfree_skb(skb); > > return -ERANGE; > > } > > > > if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { > > Unfortunately this will not prevent frames with skb->len == 1 to reach > an Ethernet driver ndo_start_xmit() > > At ndo_start_xmit(), we do not look where the MAC header supposedly > starts in the skb, we only use skb->data > > I have a syzbot repro using team driver, so I added the following part in team : > > diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c > index 0a44bbdcfb7b9f30a0c27b700246501c5eba322f..75e5ef585a8f05b35cfddbae0bfc377864e6e38c > 100644 > --- a/drivers/net/team/team.c > +++ b/drivers/net/team/team.c > @@ -1714,6 +1714,11 @@ static netdev_tx_t team_xmit(struct sk_buff > *skb, struct net_device *dev) > bool tx_success; > unsigned int len = skb->len; > > + if (len < 14) { > + pr_err_once("team_xmit(len=%u)\n", len); > + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > + WARN_ON_ONCE(1); > + } > tx_success = team_queue_override_transmit(team, skb); > if (!tx_success) > tx_success = team->ops.transmit(team, skb); > > > And I get (with your suggestion instead of skb->len) > Missing part in my copy/paste : [ 41.123829] team_xmit(len=1) [ 41.124335] skb len=1 headroom=78 headlen=1 tailroom=113 > mac=(78,0) net=(78,-1) trans=-1 > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > hash(0x0 sw=0 l4=0) proto=0x88a8 pkttype=3 iif=0 > [ 41.126553] dev name=team0 feat=0x0000e0064fddfbe9 > [ 41.127132] skb linear: 00000000: 55 > [ 41.128487] ------------[ cut here ]------------ > [ 41.128551] WARNING: CPU: 2 PID: 1880 at > drivers/net/team/team.c:1720 team_xmit (drivers/net/team/team.c:1720 > (discriminator 1)) > [ 41.129072] Modules linked in: macsec macvtap macvlan hsr wireguard > curve25519_x86_64 libcurve25519_generic libchacha20poly1305 > chacha_x86_64 libchacha poly1305_x86_64 batman_adv dummy bridge sr_mod > cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p netfs 9pnet > [ 41.129613] CPU: 2 PID: 1880 Comm: b330650301 Not tainted 6.8.0-virtme #238 > [ 41.129664] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > [ 41.129780] RIP: 0010:team_xmit (drivers/net/team/team.c:1720 > (discriminator 1)) > [ 41.129847] Code: 41 54 53 44 8b 7f 70 48 89 fb 41 83 ff 0d 77 1c 80 > 3d a0 24 8d 01 00 0f 84 0d 01 00 00 80 3d 92 24 8d 01 00 0f 84 e3 00 > 00 00 <0f> 0b 41 80 be 21 0b 00 00 00 0f 84 9d 00 00 00 0f b7 43 7c 66 > 85 > All code > ======== > 0: 41 54 push %r12 > 2: 53 push %rbx > 3: 44 8b 7f 70 mov 0x70(%rdi),%r15d > 7: 48 89 fb mov %rdi,%rbx > a: 41 83 ff 0d cmp $0xd,%r15d > e: 77 1c ja 0x2c > 10: 80 3d a0 24 8d 01 00 cmpb $0x0,0x18d24a0(%rip) # 0x18d24b7 > 17: 0f 84 0d 01 00 00 je 0x12a > 1d: 80 3d 92 24 8d 01 00 cmpb $0x0,0x18d2492(%rip) # 0x18d24b6 > 24: 0f 84 e3 00 00 00 je 0x10d > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) > 33: 00 > 34: 0f 84 9d 00 00 00 je 0xd7 > 3a: 0f b7 43 7c movzwl 0x7c(%rbx),%eax > 3e: 66 data16 > 3f: 85 .byte 0x85 > > Code starting with the faulting instruction > =========================================== > 0: 0f 0b ud2 > 2: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) > 9: 00 > a: 0f 84 9d 00 00 00 je 0xad > 10: 0f b7 43 7c movzwl 0x7c(%rbx),%eax > 14: 66 data16 > 15: 85 .byte 0x85 > [ 41.129902] RSP: 0018:ffffa4210433b938 EFLAGS: 00000246 > [ 41.129945] RAX: 0000000000000000 RBX: ffffa4210858a300 RCX: 0000000000000000 > [ 41.129961] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: 0000000000000001 > [ 41.129975] RBP: ffffa4210433b960 R08: 0000000000000000 R09: ffffa4210433b630 > [ 41.129989] R10: 0000000000000001 R11: ffffffff8407d340 R12: 0000000000000000 > [ 41.130004] R13: ffffa4210ecee000 R14: ffffa4210ece4000 R15: 0000000000000001 > [ 41.130074] FS: 00007f91d9549740(0000) GS:ffffa42fffa80000(0000) > knlGS:0000000000000000 > [ 41.130095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 41.130140] CR2: 00007f8953077fb0 CR3: 0000000104f42000 CR4: 00000000000006f0 > [ 41.130229] Call Trace: > [ 41.130331] <TASK> > [ 41.130530] ? show_regs (arch/x86/kernel/dumpstack.c:479) > [ 41.130598] ? __warn (kernel/panic.c:694) > [ 41.130611] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) > [ 41.130625] ? report_bug (lib/bug.c:180 lib/bug.c:219) > [ 41.130640] ? handle_bug (arch/x86/kernel/traps.c:239) > [ 41.130653] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) > [ 41.130665] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) > [ 41.130700] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) > [ 41.130714] ? team_xmit (drivers/net/team/team.c:1719 (discriminator 6)) > [ 41.130734] dev_hard_start_xmit (./include/linux/netdevice.h:4903 > ./include/linux/netdevice.h:4917 net/core/dev.c:3531 > net/core/dev.c:3547) > [ 41.130768] __dev_queue_xmit (./include/linux/netdevice.h:3287 > (discriminator 25) net/core/dev.c:4336 (discriminator 25)) > [ 41.130780] ? kmalloc_reserve (net/core/skbuff.c:580 (discriminator 4)) > [ 41.130796] ? pskb_expand_head (net/core/skbuff.c:2292) > [ 41.130815] __bpf_redirect (./include/linux/netdevice.h:3287 > (discriminator 25) net/core/filter.c:2143 (discriminator 25) > net/core/filter.c:2172 (discriminator 25) net/core/filter.c:2196 > (discriminator 25)) > [ 41.130825] bpf_clone_redirect (net/core/filter.c:2467 > (discriminator 1) net/core/filter.c:2439 (discriminator 1)) > [ 41.130841] bpf_prog_9845f5eee09e82c6+0x61/0x66 > [ 41.130948] ? bpf_ksym_find (./include/linux/rbtree_latch.h:113 > ./include/linux/rbtree_latch.h:208 kernel/bpf/core.c:734) > [ 41.130963] ? is_bpf_text_address > (./arch/x86/include/asm/preempt.h:84 (discriminator 13) > ./include/linux/rcupdate.h:97 (discriminator 13) > ./include/linux/rcupdate.h:813 (discriminator 13) > kernel/bpf/core.c:769 (discriminator 13)) > [ 41.130976] ? kernel_text_address (kernel/extable.c:125 > (discriminator 1) kernel/extable.c:94 (discriminator 1)) > [ 41.130989] ? __kernel_text_address (kernel/extable.c:79 (discriminator 1)) > [ 41.131002] ? unwind_get_return_address > (arch/x86/kernel/unwind_frame.c:19 (discriminator 1)) > [ 41.131014] ? __pfx_stack_trace_consume_entry (kernel/stacktrace.c:83) > [ 41.131028] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:26) > [ 41.131044] ? stack_depot_save_flags (lib/stackdepot.c:675) > [ 41.131062] ? ktime_get (kernel/time/timekeeping.c:292 > kernel/time/timekeeping.c:388 kernel/time/timekeeping.c:848) > [ 41.131076] bpf_test_run (./include/linux/bpf.h:1234 > ./include/linux/filter.h:657 ./include/linux/filter.h:664 > net/bpf/test_run.c:425) > [ 41.131087] ? security_sk_alloc (security/security.c:4662 (discriminator 13)) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-26 13:38 ` Eric Dumazet @ 2024-03-26 17:57 ` Daniel Borkmann 2024-03-26 18:08 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2024-03-26 17:57 UTC (permalink / raw) To: Eric Dumazet Cc: Stanislav Fomichev, Alexei Starovoitov, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On 3/26/24 2:38 PM, Eric Dumazet wrote: > On Tue, Mar 26, 2024 at 2:37 PM Eric Dumazet <edumazet@google.com> wrote: >> On Tue, Mar 26, 2024 at 1:46 PM Daniel Borkmann <daniel@iogearbox.net> wrote: >>> On 3/25/24 5:28 PM, Stanislav Fomichev wrote: >>>> On 03/25, Alexei Starovoitov wrote: >>>>> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: >>>>>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov >>>>>> <alexei.starovoitov@gmail.com> wrote: >>>>>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: >>>>>>>> >>>>>>>> Hello: >>>>>>>> >>>>>>>> This patch was applied to bpf/bpf.git (master) >>>>>>>> by Daniel Borkmann <daniel@iogearbox.net>: >>>>>>>> >>>>>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: >>>>>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown >>>>>>>>> by various syzbot reports [1]. >>>>>>>>> >>>>>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") >>>>>>>>> the missing attribute that can be used by upper layers. >>>>>>>>> >>>>>>>>> We need to use it in __bpf_redirect_common(). >>>>>>> >>>>>>> This patch broke empty_skb test: >>>>>>> $ test_progs -t empty_skb >>>>>>> >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>>>>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress >>>>>>> [redirect_ingress]: actual -34 != expected 0 >>>>>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress >>>>>>> [redirect_egress]: actual -34 != expected 1 >>>>>>> >>>>>>> And looking at the test I think it's not a test issue. >>>>>>> This check >>>>>>> if (unlikely(skb->len < dev->min_header_len)) >>>>>>> is rejecting more than it should. >>>>>>> >>>>>>> So I reverted this patch for now. >>>>>> >>>>>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I >>>>>> move my sanity test in __bpf_tx_skb(), >>>>>> the bpf test program still fails, I am suspecting the test needs to be adjusted. >>>>>> >>>>>> diff --git a/net/core/filter.c b/net/core/filter.c >>>>>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d >>>>>> 100644 >>>>>> --- a/net/core/filter.c >>>>>> +++ b/net/core/filter.c >>>>>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct >>>>>> net_device *dev, struct sk_buff *skb) >>>>>> return -ENETDOWN; >>>>>> } >>>>>> >>>>>> + if (unlikely(skb->len < dev->min_header_len)) { >>>>>> + pr_err_once("__bpf_tx_skb skb->len=%u < >>>>>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, >>>>>> dev->min_header_len); >>>>>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); >>>>>> + kfree_skb(skb); >>>>>> + return -ERANGE; >>>>>> + } // Note: this is before we change skb->dev >>>>>> skb->dev = dev; >>>>>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); >>>>>> skb_clear_tstamp(skb); >>>>>> >>>>>> >>>>>> --> >>>>>> >>>>>> >>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress >>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress >>>>>> [redirect_egress]: actual -34 != expected 1 >>>>>> >>>>>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) >>>>>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 >>>>>> mac=(64,14) net=(78,-1) trans=-1 >>>>>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) >>>>>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) >>>>>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 >>>>> >>>>> Hmm. Something is off. >>>>> That test creates 15 byte skb. >>>>> It's not obvious to me how it got reduced to 1. >>>>> Something stripped L2 header and the prog is trying to redirect >>>>> such skb into veth that expects skb with L2 ? >>>>> >>>>> Stan, >>>>> please take a look. >>>>> Since you wrote that test. >>>> >>>> Sure. Daniel wants to take a look on a separate thread, so we can sync >>>> up. Tentatively, seems like the failure is in the lwt path that does >>>> indeed drop the l2. >>> >>> If we'd change the test into the below, the tc and empty_skb tests pass. >>> run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus >>> skb->len is 1 in this test. We do use skb_mac_header_len() also in other >>> tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best >>> way forward.. >>> >>> static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, >>> u32 flags) >>> { >>> /* Verify that a link layer header is carried */ >>> if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { >>> kfree_skb(skb); >>> return -ERANGE; >>> } >>> >>> if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { >> >> Unfortunately this will not prevent frames with skb->len == 1 to reach >> an Ethernet driver ndo_start_xmit() >> >> At ndo_start_xmit(), we do not look where the MAC header supposedly >> starts in the skb, we only use skb->data >> >> I have a syzbot repro using team driver, so I added the following part in team : >> >> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c >> index 0a44bbdcfb7b9f30a0c27b700246501c5eba322f..75e5ef585a8f05b35cfddbae0bfc377864e6e38c >> 100644 >> --- a/drivers/net/team/team.c >> +++ b/drivers/net/team/team.c >> @@ -1714,6 +1714,11 @@ static netdev_tx_t team_xmit(struct sk_buff >> *skb, struct net_device *dev) >> bool tx_success; >> unsigned int len = skb->len; >> >> + if (len < 14) { >> + pr_err_once("team_xmit(len=%u)\n", len); >> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); >> + WARN_ON_ONCE(1); >> + } >> tx_success = team_queue_override_transmit(team, skb); >> if (!tx_success) >> tx_success = team->ops.transmit(team, skb); >> >> >> And I get (with your suggestion instead of skb->len) > > Missing part in my copy/paste : > > [ 41.123829] team_xmit(len=1) > [ 41.124335] skb len=1 headroom=78 headlen=1 tailroom=113 > >> mac=(78,0) net=(78,-1) trans=-1 Interesting. Could you also dump dev->type and/or dev->min_header_len? I suspect this case may not be ARPHRD_ETHER in team. Above says mac=(78,0), so mac len is 0 and the check against the dev->min_header_len should have dropped it if it went that branch. I wonder, is team driver missing sth like : diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index 0a44bbdcfb7b..6256f0d2f565 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2124,6 +2124,7 @@ static void team_setup_by_port(struct net_device *dev, dev->type = port_dev->type; dev->hard_header_len = port_dev->hard_header_len; dev->needed_headroom = port_dev->needed_headroom; + dev->min_header_len = port_dev->min_header_len; dev->addr_len = port_dev->addr_len; dev->mtu = port_dev->mtu; memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len); >> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) >> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) >> hash(0x0 sw=0 l4=0) proto=0x88a8 pkttype=3 iif=0 >> [ 41.126553] dev name=team0 feat=0x0000e0064fddfbe9 >> [ 41.127132] skb linear: 00000000: 55 >> [ 41.128487] ------------[ cut here ]------------ >> [ 41.128551] WARNING: CPU: 2 PID: 1880 at >> drivers/net/team/team.c:1720 team_xmit (drivers/net/team/team.c:1720 >> (discriminator 1)) >> [ 41.129072] Modules linked in: macsec macvtap macvlan hsr wireguard >> curve25519_x86_64 libcurve25519_generic libchacha20poly1305 >> chacha_x86_64 libchacha poly1305_x86_64 batman_adv dummy bridge sr_mod >> cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p netfs 9pnet >> [ 41.129613] CPU: 2 PID: 1880 Comm: b330650301 Not tainted 6.8.0-virtme #238 >> [ 41.129664] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >> BIOS 1.16.3-debian-1.16.3-2 04/01/2014 >> [ 41.129780] RIP: 0010:team_xmit (drivers/net/team/team.c:1720 >> (discriminator 1)) >> [ 41.129847] Code: 41 54 53 44 8b 7f 70 48 89 fb 41 83 ff 0d 77 1c 80 >> 3d a0 24 8d 01 00 0f 84 0d 01 00 00 80 3d 92 24 8d 01 00 0f 84 e3 00 >> 00 00 <0f> 0b 41 80 be 21 0b 00 00 00 0f 84 9d 00 00 00 0f b7 43 7c 66 >> 85 >> All code >> ======== >> 0: 41 54 push %r12 >> 2: 53 push %rbx >> 3: 44 8b 7f 70 mov 0x70(%rdi),%r15d >> 7: 48 89 fb mov %rdi,%rbx >> a: 41 83 ff 0d cmp $0xd,%r15d >> e: 77 1c ja 0x2c >> 10: 80 3d a0 24 8d 01 00 cmpb $0x0,0x18d24a0(%rip) # 0x18d24b7 >> 17: 0f 84 0d 01 00 00 je 0x12a >> 1d: 80 3d 92 24 8d 01 00 cmpb $0x0,0x18d2492(%rip) # 0x18d24b6 >> 24: 0f 84 e3 00 00 00 je 0x10d >> 2a:* 0f 0b ud2 <-- trapping instruction >> 2c: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) >> 33: 00 >> 34: 0f 84 9d 00 00 00 je 0xd7 >> 3a: 0f b7 43 7c movzwl 0x7c(%rbx),%eax >> 3e: 66 data16 >> 3f: 85 .byte 0x85 >> >> Code starting with the faulting instruction >> =========================================== >> 0: 0f 0b ud2 >> 2: 41 80 be 21 0b 00 00 cmpb $0x0,0xb21(%r14) >> 9: 00 >> a: 0f 84 9d 00 00 00 je 0xad >> 10: 0f b7 43 7c movzwl 0x7c(%rbx),%eax >> 14: 66 data16 >> 15: 85 .byte 0x85 >> [ 41.129902] RSP: 0018:ffffa4210433b938 EFLAGS: 00000246 >> [ 41.129945] RAX: 0000000000000000 RBX: ffffa4210858a300 RCX: 0000000000000000 >> [ 41.129961] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: 0000000000000001 >> [ 41.129975] RBP: ffffa4210433b960 R08: 0000000000000000 R09: ffffa4210433b630 >> [ 41.129989] R10: 0000000000000001 R11: ffffffff8407d340 R12: 0000000000000000 >> [ 41.130004] R13: ffffa4210ecee000 R14: ffffa4210ece4000 R15: 0000000000000001 >> [ 41.130074] FS: 00007f91d9549740(0000) GS:ffffa42fffa80000(0000) >> knlGS:0000000000000000 >> [ 41.130095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 41.130140] CR2: 00007f8953077fb0 CR3: 0000000104f42000 CR4: 00000000000006f0 >> [ 41.130229] Call Trace: >> [ 41.130331] <TASK> >> [ 41.130530] ? show_regs (arch/x86/kernel/dumpstack.c:479) >> [ 41.130598] ? __warn (kernel/panic.c:694) >> [ 41.130611] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) >> [ 41.130625] ? report_bug (lib/bug.c:180 lib/bug.c:219) >> [ 41.130640] ? handle_bug (arch/x86/kernel/traps.c:239) >> [ 41.130653] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) >> [ 41.130665] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) >> [ 41.130700] ? team_xmit (drivers/net/team/team.c:1720 (discriminator 1)) >> [ 41.130714] ? team_xmit (drivers/net/team/team.c:1719 (discriminator 6)) >> [ 41.130734] dev_hard_start_xmit (./include/linux/netdevice.h:4903 >> ./include/linux/netdevice.h:4917 net/core/dev.c:3531 >> net/core/dev.c:3547) >> [ 41.130768] __dev_queue_xmit (./include/linux/netdevice.h:3287 >> (discriminator 25) net/core/dev.c:4336 (discriminator 25)) >> [ 41.130780] ? kmalloc_reserve (net/core/skbuff.c:580 (discriminator 4)) >> [ 41.130796] ? pskb_expand_head (net/core/skbuff.c:2292) >> [ 41.130815] __bpf_redirect (./include/linux/netdevice.h:3287 >> (discriminator 25) net/core/filter.c:2143 (discriminator 25) >> net/core/filter.c:2172 (discriminator 25) net/core/filter.c:2196 >> (discriminator 25)) >> [ 41.130825] bpf_clone_redirect (net/core/filter.c:2467 >> (discriminator 1) net/core/filter.c:2439 (discriminator 1)) >> [ 41.130841] bpf_prog_9845f5eee09e82c6+0x61/0x66 >> [ 41.130948] ? bpf_ksym_find (./include/linux/rbtree_latch.h:113 >> ./include/linux/rbtree_latch.h:208 kernel/bpf/core.c:734) >> [ 41.130963] ? is_bpf_text_address >> (./arch/x86/include/asm/preempt.h:84 (discriminator 13) >> ./include/linux/rcupdate.h:97 (discriminator 13) >> ./include/linux/rcupdate.h:813 (discriminator 13) >> kernel/bpf/core.c:769 (discriminator 13)) >> [ 41.130976] ? kernel_text_address (kernel/extable.c:125 >> (discriminator 1) kernel/extable.c:94 (discriminator 1)) >> [ 41.130989] ? __kernel_text_address (kernel/extable.c:79 (discriminator 1)) >> [ 41.131002] ? unwind_get_return_address >> (arch/x86/kernel/unwind_frame.c:19 (discriminator 1)) >> [ 41.131014] ? __pfx_stack_trace_consume_entry (kernel/stacktrace.c:83) >> [ 41.131028] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:26) >> [ 41.131044] ? stack_depot_save_flags (lib/stackdepot.c:675) >> [ 41.131062] ? ktime_get (kernel/time/timekeeping.c:292 >> kernel/time/timekeeping.c:388 kernel/time/timekeeping.c:848) >> [ 41.131076] bpf_test_run (./include/linux/bpf.h:1234 >> ./include/linux/filter.h:657 ./include/linux/filter.h:664 >> net/bpf/test_run.c:425) >> [ 41.131087] ? security_sk_alloc (security/security.c:4662 (discriminator 13)) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-26 17:57 ` Daniel Borkmann @ 2024-03-26 18:08 ` Eric Dumazet 2024-09-22 17:39 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2024-03-26 18:08 UTC (permalink / raw) To: Daniel Borkmann Cc: Stanislav Fomichev, Alexei Starovoitov, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On Tue, Mar 26, 2024 at 6:57 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 3/26/24 2:38 PM, Eric Dumazet wrote: > > On Tue, Mar 26, 2024 at 2:37 PM Eric Dumazet <edumazet@google.com> wrote: > >> On Tue, Mar 26, 2024 at 1:46 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > >>> On 3/25/24 5:28 PM, Stanislav Fomichev wrote: > >>>> On 03/25, Alexei Starovoitov wrote: > >>>>> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > >>>>>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > >>>>>> <alexei.starovoitov@gmail.com> wrote: > >>>>>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > >>>>>>>> > >>>>>>>> Hello: > >>>>>>>> > >>>>>>>> This patch was applied to bpf/bpf.git (master) > >>>>>>>> by Daniel Borkmann <daniel@iogearbox.net>: > >>>>>>>> > >>>>>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > >>>>>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown > >>>>>>>>> by various syzbot reports [1]. > >>>>>>>>> > >>>>>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > >>>>>>>>> the missing attribute that can be used by upper layers. > >>>>>>>>> > >>>>>>>>> We need to use it in __bpf_redirect_common(). > >>>>>>> > >>>>>>> This patch broke empty_skb test: > >>>>>>> $ test_progs -t empty_skb > >>>>>>> > >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>>>>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>>>>>> [redirect_ingress]: actual -34 != expected 0 > >>>>>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>>>>>> [redirect_egress]: actual -34 != expected 1 > >>>>>>> > >>>>>>> And looking at the test I think it's not a test issue. > >>>>>>> This check > >>>>>>> if (unlikely(skb->len < dev->min_header_len)) > >>>>>>> is rejecting more than it should. > >>>>>>> > >>>>>>> So I reverted this patch for now. > >>>>>> > >>>>>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > >>>>>> move my sanity test in __bpf_tx_skb(), > >>>>>> the bpf test program still fails, I am suspecting the test needs to be adjusted. > >>>>>> > >>>>>> diff --git a/net/core/filter.c b/net/core/filter.c > >>>>>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > >>>>>> 100644 > >>>>>> --- a/net/core/filter.c > >>>>>> +++ b/net/core/filter.c > >>>>>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > >>>>>> net_device *dev, struct sk_buff *skb) > >>>>>> return -ENETDOWN; > >>>>>> } > >>>>>> > >>>>>> + if (unlikely(skb->len < dev->min_header_len)) { > >>>>>> + pr_err_once("__bpf_tx_skb skb->len=%u < > >>>>>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > >>>>>> dev->min_header_len); > >>>>>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > >>>>>> + kfree_skb(skb); > >>>>>> + return -ERANGE; > >>>>>> + } // Note: this is before we change skb->dev > >>>>>> skb->dev = dev; > >>>>>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > >>>>>> skb_clear_tstamp(skb); > >>>>>> > >>>>>> > >>>>>> --> > >>>>>> > >>>>>> > >>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > >>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > >>>>>> [redirect_egress]: actual -34 != expected 1 > >>>>>> > >>>>>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > >>>>>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > >>>>>> mac=(64,14) net=(78,-1) trans=-1 > >>>>>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > >>>>>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > >>>>>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > >>>>> > >>>>> Hmm. Something is off. > >>>>> That test creates 15 byte skb. > >>>>> It's not obvious to me how it got reduced to 1. > >>>>> Something stripped L2 header and the prog is trying to redirect > >>>>> such skb into veth that expects skb with L2 ? > >>>>> > >>>>> Stan, > >>>>> please take a look. > >>>>> Since you wrote that test. > >>>> > >>>> Sure. Daniel wants to take a look on a separate thread, so we can sync > >>>> up. Tentatively, seems like the failure is in the lwt path that does > >>>> indeed drop the l2. > >>> > >>> If we'd change the test into the below, the tc and empty_skb tests pass. > >>> run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus > >>> skb->len is 1 in this test. We do use skb_mac_header_len() also in other > >>> tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best > >>> way forward.. > >>> > >>> static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > >>> u32 flags) > >>> { > >>> /* Verify that a link layer header is carried */ > >>> if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { > >>> kfree_skb(skb); > >>> return -ERANGE; > >>> } > >>> > >>> if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { > >> > >> Unfortunately this will not prevent frames with skb->len == 1 to reach > >> an Ethernet driver ndo_start_xmit() > >> > >> At ndo_start_xmit(), we do not look where the MAC header supposedly > >> starts in the skb, we only use skb->data > >> > >> I have a syzbot repro using team driver, so I added the following part in team : > >> > >> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c > >> index 0a44bbdcfb7b9f30a0c27b700246501c5eba322f..75e5ef585a8f05b35cfddbae0bfc377864e6e38c > >> 100644 > >> --- a/drivers/net/team/team.c > >> +++ b/drivers/net/team/team.c > >> @@ -1714,6 +1714,11 @@ static netdev_tx_t team_xmit(struct sk_buff > >> *skb, struct net_device *dev) > >> bool tx_success; > >> unsigned int len = skb->len; > >> > >> + if (len < 14) { > >> + pr_err_once("team_xmit(len=%u)\n", len); > >> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > >> + WARN_ON_ONCE(1); > >> + } > >> tx_success = team_queue_override_transmit(team, skb); > >> if (!tx_success) > >> tx_success = team->ops.transmit(team, skb); > >> > >> > >> And I get (with your suggestion instead of skb->len) > > > > Missing part in my copy/paste : > > > > [ 41.123829] team_xmit(len=1) > > [ 41.124335] skb len=1 headroom=78 headlen=1 tailroom=113 > > > >> mac=(78,0) net=(78,-1) trans=-1 > > Interesting. > > Could you also dump dev->type and/or dev->min_header_len? I suspect > this case may not be ARPHRD_ETHER in team. > > Above says mac=(78,0), so mac len is 0 and the check against the > dev->min_header_len should have dropped it if it went that branch. mac header is reset in __dev_queue_xmit() : skb_reset_mac_header(skb); So when the bpf code ran, skb_mac_header_len(skb) was 14, but later the MAC header was set (to skb->data) > > I wonder, is team driver missing sth like : > > diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c > index 0a44bbdcfb7b..6256f0d2f565 100644 > --- a/drivers/net/team/team.c > +++ b/drivers/net/team/team.c > @@ -2124,6 +2124,7 @@ static void team_setup_by_port(struct net_device *dev, > dev->type = port_dev->type; > dev->hard_header_len = port_dev->hard_header_len; > dev->needed_headroom = port_dev->needed_headroom; > + dev->min_header_len = port_dev->min_header_len; > dev->addr_len = port_dev->addr_len; > dev->mtu = port_dev->mtu; > memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len); > I have confirmed that team min_header_len is 14, nothing seems to be missing I think. team_xmit(dev team0, skb->len=1, dev->min_header_len=14) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] bpf: Don't redirect too small packets 2024-03-26 18:08 ` Eric Dumazet @ 2024-09-22 17:39 ` Eric Dumazet 0 siblings, 0 replies; 13+ messages in thread From: Eric Dumazet @ 2024-09-22 17:39 UTC (permalink / raw) To: Daniel Borkmann Cc: Stanislav Fomichev, Alexei Starovoitov, Guillaume Nault, patchwork-bot+netdevbpf, David S. Miller, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko, Network Development, bpf, Eric Dumazet, syzbot+9e27778c0edc62cb97d8, Willem de Bruijn On Tue, Mar 26, 2024 at 7:08 PM Eric Dumazet <edumazet@google.com> wrote: > > On Tue, Mar 26, 2024 at 6:57 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On 3/26/24 2:38 PM, Eric Dumazet wrote: > > > On Tue, Mar 26, 2024 at 2:37 PM Eric Dumazet <edumazet@google.com> wrote: > > >> On Tue, Mar 26, 2024 at 1:46 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > >>> On 3/25/24 5:28 PM, Stanislav Fomichev wrote: > > >>>> On 03/25, Alexei Starovoitov wrote: > > >>>>> On Mon, Mar 25, 2024 at 6:33 AM Eric Dumazet <edumazet@google.com> wrote: > > >>>>>> On Sat, Mar 23, 2024 at 4:02 AM Alexei Starovoitov > > >>>>>> <alexei.starovoitov@gmail.com> wrote: > > >>>>>>> On Fri, Mar 22, 2024 at 7:10 AM <patchwork-bot+netdevbpf@kernel.org> wrote: > > >>>>>>>> > > >>>>>>>> Hello: > > >>>>>>>> > > >>>>>>>> This patch was applied to bpf/bpf.git (master) > > >>>>>>>> by Daniel Borkmann <daniel@iogearbox.net>: > > >>>>>>>> > > >>>>>>>> On Fri, 22 Mar 2024 12:24:07 +0000 you wrote: > > >>>>>>>>> Some drivers ndo_start_xmit() expect a minimal size, as shown > > >>>>>>>>> by various syzbot reports [1]. > > >>>>>>>>> > > >>>>>>>>> Willem added in commit 217e6fa24ce2 ("net: introduce device min_header_len") > > >>>>>>>>> the missing attribute that can be used by upper layers. > > >>>>>>>>> > > >>>>>>>>> We need to use it in __bpf_redirect_common(). > > >>>>>>> > > >>>>>>> This patch broke empty_skb test: > > >>>>>>> $ test_progs -t empty_skb > > >>>>>>> > > >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>>>>>> [redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>>>>>> [redirect_ingress]: actual -34 != expected 0 > > >>>>>>> test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress [redirect_egress] 0 nsec > > >>>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>>>>>> [redirect_egress]: actual -34 != expected 1 > > >>>>>>> > > >>>>>>> And looking at the test I think it's not a test issue. > > >>>>>>> This check > > >>>>>>> if (unlikely(skb->len < dev->min_header_len)) > > >>>>>>> is rejecting more than it should. > > >>>>>>> > > >>>>>>> So I reverted this patch for now. > > >>>>>> > > >>>>>> OK, it seems I missed __bpf_rx_skb() vs __bpf_tx_skb(), but even if I > > >>>>>> move my sanity test in __bpf_tx_skb(), > > >>>>>> the bpf test program still fails, I am suspecting the test needs to be adjusted. > > >>>>>> > > >>>>>> diff --git a/net/core/filter.c b/net/core/filter.c > > >>>>>> index 745697c08acb3a74721d26ee93389efa81e973a0..e9c0e2087a08f1d8afd2c3e8e7871ddc9231b76d > > >>>>>> 100644 > > >>>>>> --- a/net/core/filter.c > > >>>>>> +++ b/net/core/filter.c > > >>>>>> @@ -2128,6 +2128,12 @@ static inline int __bpf_tx_skb(struct > > >>>>>> net_device *dev, struct sk_buff *skb) > > >>>>>> return -ENETDOWN; > > >>>>>> } > > >>>>>> > > >>>>>> + if (unlikely(skb->len < dev->min_header_len)) { > > >>>>>> + pr_err_once("__bpf_tx_skb skb->len=%u < > > >>>>>> dev(%s)->min_header_len(%u)\n", skb->len, dev->name, > > >>>>>> dev->min_header_len); > > >>>>>> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > > >>>>>> + kfree_skb(skb); > > >>>>>> + return -ERANGE; > > >>>>>> + } // Note: this is before we change skb->dev > > >>>>>> skb->dev = dev; > > >>>>>> skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb)); > > >>>>>> skb_clear_tstamp(skb); > > >>>>>> > > >>>>>> > > >>>>>> --> > > >>>>>> > > >>>>>> > > >>>>>> test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress > > >>>>>> [redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress > > >>>>>> [redirect_egress]: actual -34 != expected 1 > > >>>>>> > > >>>>>> [ 58.382051] __bpf_tx_skb skb->len=1 < dev(veth0)->min_header_len(14) > > >>>>>> [ 58.382778] skb len=1 headroom=78 headlen=1 tailroom=113 > > >>>>>> mac=(64,14) net=(78,-1) trans=-1 > > >>>>>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > > >>>>>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > > >>>>>> hash(0x0 sw=0 l4=0) proto=0x7f00 pkttype=0 iif=0 > > >>>>> > > >>>>> Hmm. Something is off. > > >>>>> That test creates 15 byte skb. > > >>>>> It's not obvious to me how it got reduced to 1. > > >>>>> Something stripped L2 header and the prog is trying to redirect > > >>>>> such skb into veth that expects skb with L2 ? > > >>>>> > > >>>>> Stan, > > >>>>> please take a look. > > >>>>> Since you wrote that test. > > >>>> > > >>>> Sure. Daniel wants to take a look on a separate thread, so we can sync > > >>>> up. Tentatively, seems like the failure is in the lwt path that does > > >>>> indeed drop the l2. > > >>> > > >>> If we'd change the test into the below, the tc and empty_skb tests pass. > > >>> run_lwt_bpf() calls into skb_do_redirect() which has L2 stripped, and thus > > >>> skb->len is 1 in this test. We do use skb_mac_header_len() also in other > > >>> tc BPF helpers, so perhaps s/skb->len/skb_mac_header_len(skb)/ is the best > > >>> way forward.. > > >>> > > >>> static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > > >>> u32 flags) > > >>> { > > >>> /* Verify that a link layer header is carried */ > > >>> if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { > > >>> kfree_skb(skb); > > >>> return -ERANGE; > > >>> } > > >>> > > >>> if (unlikely(skb_mac_header_len(skb) < dev->min_header_len)) { > > >> > > >> Unfortunately this will not prevent frames with skb->len == 1 to reach > > >> an Ethernet driver ndo_start_xmit() > > >> > > >> At ndo_start_xmit(), we do not look where the MAC header supposedly > > >> starts in the skb, we only use skb->data > > >> > > >> I have a syzbot repro using team driver, so I added the following part in team : > > >> > > >> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c > > >> index 0a44bbdcfb7b9f30a0c27b700246501c5eba322f..75e5ef585a8f05b35cfddbae0bfc377864e6e38c > > >> 100644 > > >> --- a/drivers/net/team/team.c > > >> +++ b/drivers/net/team/team.c > > >> @@ -1714,6 +1714,11 @@ static netdev_tx_t team_xmit(struct sk_buff > > >> *skb, struct net_device *dev) > > >> bool tx_success; > > >> unsigned int len = skb->len; > > >> > > >> + if (len < 14) { > > >> + pr_err_once("team_xmit(len=%u)\n", len); > > >> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false); > > >> + WARN_ON_ONCE(1); > > >> + } > > >> tx_success = team_queue_override_transmit(team, skb); > > >> if (!tx_success) > > >> tx_success = team->ops.transmit(team, skb); > > >> > > >> > > >> And I get (with your suggestion instead of skb->len) > > > > > > Missing part in my copy/paste : > > > > > > [ 41.123829] team_xmit(len=1) > > > [ 41.124335] skb len=1 headroom=78 headlen=1 tailroom=113 > > > > > >> mac=(78,0) net=(78,-1) trans=-1 > > > > Interesting. > > > > Could you also dump dev->type and/or dev->min_header_len? I suspect > > this case may not be ARPHRD_ETHER in team. > > > > Above says mac=(78,0), so mac len is 0 and the check against the > > dev->min_header_len should have dropped it if it went that branch. > > mac header is reset in __dev_queue_xmit() : > > skb_reset_mac_header(skb); > > So when the bpf code ran, skb_mac_header_len(skb) was 14, > but later the MAC header was set (to skb->data) > > > > > I wonder, is team driver missing sth like : > > > > diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c > > index 0a44bbdcfb7b..6256f0d2f565 100644 > > --- a/drivers/net/team/team.c > > +++ b/drivers/net/team/team.c > > @@ -2124,6 +2124,7 @@ static void team_setup_by_port(struct net_device *dev, > > dev->type = port_dev->type; > > dev->hard_header_len = port_dev->hard_header_len; > > dev->needed_headroom = port_dev->needed_headroom; > > + dev->min_header_len = port_dev->min_header_len; > > dev->addr_len = port_dev->addr_len; > > dev->mtu = port_dev->mtu; > > memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len); > > > > > I have confirmed that team min_header_len is 14, nothing seems to be > missing I think. > > team_xmit(dev team0, skb->len=1, dev->min_header_len=14) FYI, I am releasing today a syzbot report with the same problem. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-09-22 17:39 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-22 12:24 [PATCH net] bpf: Don't redirect too small packets Eric Dumazet 2024-03-22 14:10 ` patchwork-bot+netdevbpf 2024-03-23 3:01 ` Alexei Starovoitov 2024-03-25 13:33 ` Eric Dumazet 2024-03-25 15:47 ` Daniel Borkmann 2024-03-25 15:56 ` Alexei Starovoitov 2024-03-25 16:28 ` Stanislav Fomichev 2024-03-26 12:46 ` Daniel Borkmann 2024-03-26 13:37 ` Eric Dumazet 2024-03-26 13:38 ` Eric Dumazet 2024-03-26 17:57 ` Daniel Borkmann 2024-03-26 18:08 ` Eric Dumazet 2024-09-22 17:39 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).