* [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
@ 2024-09-24 15:02 Eric Dumazet
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet
Inspired by one syzbot report.
At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
having a sane value (not zero)
With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
to precisely set qdisc_skb_cb(skb)->pkt_len to zero.
First patch fixes this issue.
Second one (a separate one to help future bisections) adds
more sanity check to SKB_GSO_DODGY users.
Eric Dumazet (2):
net: avoid potential underflow in qdisc_pkt_len_init() with UFO
net: add more sanity checks to qdisc_pkt_len_init()
net/core/dev.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
--
2.46.0.792.g87dc391469-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-24 15:02 ` Eric Dumazet
2024-09-26 9:07 ` Willem de Bruijn
2024-09-26 18:00 ` Jonathan Davies
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
` (2 subsequent siblings)
3 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet, syzbot
After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
requests coming from user") virtio_net_hdr_to_skb() had sanity check
to detect malicious attempts from user space to cook a bad GSO packet.
Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
transport header in UFO") while fixing one issue, allowed user space
to cook a GSO packet with the following characteristic :
IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
When this packet arrives in qdisc_pkt_len_init(), we end up
with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
Then the following sets gso_segs to 0 :
gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
shinfo->gso_size);
Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
This leads to the following crash in fq_codel [1]
qdisc_pkt_len_init() is best effort, we only want an estimation
of the bytes sent on the wire, not crashing the kernel.
This patch is fixing this particular issue, a following one
adds more sanity checks for another potential bug.
[1]
[ 70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 70.724561] #PF: supervisor read access in kernel mode
[ 70.724561] #PF: error_code(0x0000) - not-present page
[ 70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
[ 70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
[ 70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
[ 70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
All code
========
0: 24 08 and $0x8,%al
2: 49 c1 e1 06 shl $0x6,%r9
6: 44 89 7c 24 18 mov %r15d,0x18(%rsp)
b: 45 31 ed xor %r13d,%r13d
e: 45 31 c0 xor %r8d,%r8d
11: 31 ff xor %edi,%edi
13: 89 44 24 14 mov %eax,0x14(%rsp)
17: 4c 03 8b 90 01 00 00 add 0x190(%rbx),%r9
1e: eb 04 jmp 0x24
20: 39 ca cmp %ecx,%edx
22: 73 37 jae 0x5b
24: 4d 8b 39 mov (%r9),%r15
27: 83 c7 01 add $0x1,%edi
2a:* 49 8b 17 mov (%r15),%rdx <-- trapping instruction
2d: 49 89 11 mov %rdx,(%r9)
30: 41 8b 57 28 mov 0x28(%r15),%edx
34: 45 8b 5f 34 mov 0x34(%r15),%r11d
38: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
3f: 49 rex.WB
Code starting with the faulting instruction
===========================================
0: 49 8b 17 mov (%r15),%rdx
3: 49 89 11 mov %rdx,(%r9)
6: 41 8b 57 28 mov 0x28(%r15),%edx
a: 45 8b 5f 34 mov 0x34(%r15),%r11d
e: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
15: 49 rex.WB
[ 70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
[ 70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
[ 70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
[ 70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
[ 70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
[ 70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
[ 70.724561] FS: 000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
[ 70.724561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
[ 70.724561] Call Trace:
[ 70.724561] <TASK>
[ 70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
[ 70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[ 70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[ 70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[ 70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
[ 70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
[ 70.724561] ? irqentry_enter (kernel/entry/common.c:237)
[ 70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
[ 70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
[ 70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
[ 70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
[ 70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
[ 70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
[ 70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
[ 70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
[ 70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
[ 70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
[ 70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[ 70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 70.724561] RIP: 0033:0x41ae09
Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jonathan Davies <jonathan.davies@nutanix.com>
---
net/core/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 1e740faf9e783b047b2dc7d9fd4242e4e6c7317a..f2c47da79f17d5ebe6b334b63d66c84c84c519fc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3758,7 +3758,7 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
sizeof(_tcphdr), &_tcphdr);
if (likely(th))
hdr_len += __tcp_hdrlen(th);
- } else {
+ } else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
struct udphdr _udphdr;
if (skb_header_pointer(skb, hdr_len,
--
2.46.0.792.g87dc391469-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
@ 2024-09-24 15:02 ` Eric Dumazet
2024-09-25 17:51 ` Joe Damato
2024-09-26 9:13 ` Willem de Bruijn
2024-09-26 2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
2024-10-01 10:10 ` patchwork-bot+netdevbpf
3 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet
One path takes care of SKB_GSO_DODGY, assuming
skb->len is bigger than hdr_len.
virtio_net_hdr_to_skb() does not fully dissect TCP headers,
it only make sure it is at least 20 bytes.
It is possible for an user to provide a malicious 'GSO' packet,
total length of 80 bytes.
- 20 bytes of IPv4 header
- 60 bytes TCP header
- a small gso_size like 8
virtio_net_hdr_to_skb() would declare this packet as a normal
GSO packet, because it would see 40 bytes of payload,
bigger than gso_size.
We need to make detect this case to not underflow
qdisc_skb_cb(skb)->pkt_len.
Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
hdr_len += sizeof(struct udphdr);
}
- if (shinfo->gso_type & SKB_GSO_DODGY)
- gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
- shinfo->gso_size);
+ if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
+ int payload = skb->len - hdr_len;
+ /* Malicious packet. */
+ if (payload <= 0)
+ return;
+ gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
+ }
qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
}
}
--
2.46.0.792.g87dc391469-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-25 17:51 ` Joe Damato
2024-09-25 18:00 ` Eric Dumazet
2024-09-26 9:13 ` Willem de Bruijn
1 sibling, 1 reply; 20+ messages in thread
From: Joe Damato @ 2024-09-25 17:51 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> One path takes care of SKB_GSO_DODGY, assuming
> skb->len is bigger than hdr_len.
My only comment, which you may feel free to ignore, is that we've
recently merged a change to replace the term 'sanity check' in the
code [1].
Given that work is being done to replace terminology in the source
code, I am wondering if that same ruling applies to commit messages.
If so, perhaps the title of this commit can be adjusted?
[1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 17:51 ` Joe Damato
@ 2024-09-25 18:00 ` Eric Dumazet
2024-09-25 18:24 ` Joe Damato
0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 18:00 UTC (permalink / raw)
To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:
>
> On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> > One path takes care of SKB_GSO_DODGY, assuming
> > skb->len is bigger than hdr_len.
>
> My only comment, which you may feel free to ignore, is that we've
> recently merged a change to replace the term 'sanity check' in the
> code [1].
>
> Given that work is being done to replace terminology in the source
> code, I am wondering if that same ruling applies to commit messages.
>
> If so, perhaps the title of this commit can be adjusted?
>
> [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/
I guess I could write the changelog in French, to make sure it is all good.
git log --oneline --grep "sanity check" | wc -l
3397
I dunno...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 18:00 ` Eric Dumazet
@ 2024-09-25 18:24 ` Joe Damato
2024-09-25 18:27 ` Stephen Hemminger
2024-09-25 18:55 ` Eric Dumazet
0 siblings, 2 replies; 20+ messages in thread
From: Joe Damato @ 2024-09-25 18:24 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 08:00:08PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:
> >
> > On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> > > One path takes care of SKB_GSO_DODGY, assuming
> > > skb->len is bigger than hdr_len.
> >
> > My only comment, which you may feel free to ignore, is that we've
> > recently merged a change to replace the term 'sanity check' in the
> > code [1].
> >
> > Given that work is being done to replace terminology in the source
> > code, I am wondering if that same ruling applies to commit messages.
> >
> > If so, perhaps the title of this commit can be adjusted?
> >
> > [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/
>
> I guess I could write the changelog in French, to make sure it is all good.
You could, but then I'd probably email you and remind you that the
documentation explicitly says "It’s important to describe the change
in plain English" [1].
> git log --oneline --grep "sanity check" | wc -l
> 3397
I don't know what this means. We've done it in the past and so
should continue to do it in the future? OK.
> I dunno...
Me either, but if code is being merged as recently as a few days ago
to remove this term...
[1]: https://www.kernel.org/doc/html/v6.11/process/submitting-patches.html#describe-your-changes
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 18:24 ` Joe Damato
@ 2024-09-25 18:27 ` Stephen Hemminger
2024-09-25 18:55 ` Eric Dumazet
1 sibling, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2024-09-25 18:27 UTC (permalink / raw)
To: Joe Damato
Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni,
David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet
On Wed, 25 Sep 2024 11:24:04 -0700
Joe Damato <jdamato@fastly.com> wrote:
> On Wed, Sep 25, 2024 at 08:00:08PM +0200, Eric Dumazet wrote:
> > On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:
> > >
> > > On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> > > > One path takes care of SKB_GSO_DODGY, assuming
> > > > skb->len is bigger than hdr_len.
> > >
> > > My only comment, which you may feel free to ignore, is that we've
> > > recently merged a change to replace the term 'sanity check' in the
> > > code [1].
> > >
> > > Given that work is being done to replace terminology in the source
> > > code, I am wondering if that same ruling applies to commit messages.
> > >
> > > If so, perhaps the title of this commit can be adjusted?
> > >
> > > [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/
> >
> > I guess I could write the changelog in French, to make sure it is all good.
>
> You could, but then I'd probably email you and remind you that the
> documentation explicitly says "It’s important to describe the change
> in plain English" [1].
>
> > git log --oneline --grep "sanity check" | wc -l
> > 3397
>
> I don't know what this means. We've done it in the past and so
> should continue to do it in the future? OK.
>
> > I dunno...
>
> Me either, but if code is being merged as recently as a few days ago
> to remove this term...
>
> [1]: https://www.kernel.org/doc/html/v6.11/process/submitting-patches.html#describe-your-changes
>
Less concerned about commit log or subject.
My patch was more about keeping non-inclusive naming out of current code base.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 18:24 ` Joe Damato
2024-09-25 18:27 ` Stephen Hemminger
@ 2024-09-25 18:55 ` Eric Dumazet
2024-09-25 19:01 ` Eric Dumazet
2024-09-25 19:07 ` Joe Damato
1 sibling, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 18:55 UTC (permalink / raw)
To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
>
>
> > git log --oneline --grep "sanity check" | wc -l
> > 3397
>
> I don't know what this means. We've done it in the past and so
> should continue to do it in the future? OK.
This means that if they are in the changelogs, they can not be removed.
This is immutable stuff.
Should we zap git history because of some 'bad words' that most
authors/committers/reviewers were not even aware of?
I would understand for stuff visible in the code (comments, error messages),
but the changelogs are there and can not be changed.
Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
then we can remove/change the _comment_ I added in this patch.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 18:55 ` Eric Dumazet
@ 2024-09-25 19:01 ` Eric Dumazet
2024-09-25 19:15 ` Joe Damato
2024-09-25 19:07 ` Joe Damato
1 sibling, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 19:01 UTC (permalink / raw)
To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 8:55 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> >
>
> >
> > > git log --oneline --grep "sanity check" | wc -l
> > > 3397
> >
> > I don't know what this means. We've done it in the past and so
> > should continue to do it in the future? OK.
>
> This means that if they are in the changelogs, they can not be removed.
> This is immutable stuff.
> Should we zap git history because of some 'bad words' that most
> authors/committers/reviewers were not even aware of?
>
> I would understand for stuff visible in the code (comments, error messages),
> but the changelogs are there and can not be changed.
>
> Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
> then we can remove/change the _comment_ I added in this patch.
BTW, I looked at https://en.wikipedia.org/wiki/Sanity_check
and the non inclusive part is at the very end of it.
I would suggest moving it at the beginning of the article to clearly
educate all potential users.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 18:55 ` Eric Dumazet
2024-09-25 19:01 ` Eric Dumazet
@ 2024-09-25 19:07 ` Joe Damato
2024-09-25 19:08 ` Eric Dumazet
1 sibling, 1 reply; 20+ messages in thread
From: Joe Damato @ 2024-09-25 19:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 08:55:16PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> >
>
> >
> > > git log --oneline --grep "sanity check" | wc -l
> > > 3397
> >
> > I don't know what this means. We've done it in the past and so
> > should continue to do it in the future? OK.
>
> This means that if they are in the changelogs, they can not be removed.
> This is immutable stuff.
> Should we zap git history because of some 'bad words' that most
> authors/committers/reviewers were not even aware of?
I never suggested that. I suggested not adding more to the
changelog and also mentioned you could feel free to ignore my
comment.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 19:07 ` Joe Damato
@ 2024-09-25 19:08 ` Eric Dumazet
0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 19:08 UTC (permalink / raw)
To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 9:07 PM Joe Damato <jdamato@fastly.com> wrote:
>
> On Wed, Sep 25, 2024 at 08:55:16PM +0200, Eric Dumazet wrote:
> > On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> > >
> >
> > >
> > > > git log --oneline --grep "sanity check" | wc -l
> > > > 3397
> > >
> > > I don't know what this means. We've done it in the past and so
> > > should continue to do it in the future? OK.
> >
> > This means that if they are in the changelogs, they can not be removed.
> > This is immutable stuff.
> > Should we zap git history because of some 'bad words' that most
> > authors/committers/reviewers were not even aware of?
>
> I never suggested that. I suggested not adding more to the
> changelog and also mentioned you could feel free to ignore my
> comment.
I do not ignore comments, I learnt something new today, thanks !
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-25 19:01 ` Eric Dumazet
@ 2024-09-25 19:15 ` Joe Damato
0 siblings, 0 replies; 20+ messages in thread
From: Joe Damato @ 2024-09-25 19:15 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On Wed, Sep 25, 2024 at 09:01:23PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 8:55 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> > >
> >
> > >
> > > > git log --oneline --grep "sanity check" | wc -l
> > > > 3397
> > >
> > > I don't know what this means. We've done it in the past and so
> > > should continue to do it in the future? OK.
> >
> > This means that if they are in the changelogs, they can not be removed.
> > This is immutable stuff.
> > Should we zap git history because of some 'bad words' that most
> > authors/committers/reviewers were not even aware of?
> >
> > I would understand for stuff visible in the code (comments, error messages),
> > but the changelogs are there and can not be changed.
> >
> > Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
> > then we can remove/change the _comment_ I added in this patch.
>
> BTW, I looked at https://en.wikipedia.org/wiki/Sanity_check
> and the non inclusive part is at the very end of it.
>
> I would suggest moving it at the beginning of the article to clearly
> educate all potential users.
Stephen has provided what was needed, which was: guidance. The
guidance appears to be that code avoid this phrase, but commit
messages are of less concern.
This was not clear to me before, but it is very clear now and I have
learned something for future reviews.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-26 2:26 ` David Ahern
2024-10-01 10:10 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2024-09-26 2:26 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On 9/24/24 9:02 AM, Eric Dumazet wrote:
> Inspired by one syzbot report.
>
> At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
> having a sane value (not zero)
>
> With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
> to precisely set qdisc_skb_cb(skb)->pkt_len to zero.
>
> First patch fixes this issue.
>
> Second one (a separate one to help future bisections) adds
> more sanity check to SKB_GSO_DODGY users.
>
> Eric Dumazet (2):
> net: avoid potential underflow in qdisc_pkt_len_init() with UFO
> net: add more sanity checks to qdisc_pkt_len_init()
>
> net/core/dev.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
LGTM. For the set:
Reviewed-by: David Ahern <dsahern@kernel.org>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
@ 2024-09-26 9:07 ` Willem de Bruijn
2024-09-26 18:00 ` Jonathan Davies
1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26 9:07 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet, syzbot
Eric Dumazet wrote:
> After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
> requests coming from user") virtio_net_hdr_to_skb() had sanity check
> to detect malicious attempts from user space to cook a bad GSO packet.
>
> Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
> transport header in UFO") while fixing one issue, allowed user space
> to cook a GSO packet with the following characteristic :
>
> IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
>
> When this packet arrives in qdisc_pkt_len_init(), we end up
> with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
>
> Then the following sets gso_segs to 0 :
>
> gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> shinfo->gso_size);
>
> Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
>
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
>
> This leads to the following crash in fq_codel [1]
>
> qdisc_pkt_len_init() is best effort, we only want an estimation
> of the bytes sent on the wire, not crashing the kernel.
>
> This patch is fixing this particular issue, a following one
> adds more sanity checks for another potential bug.
>
> [1]
> [ 70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 70.724561] #PF: supervisor read access in kernel mode
> [ 70.724561] #PF: error_code(0x0000) - not-present page
> [ 70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
> [ 70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
> [ 70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
> All code
> ========
> 0: 24 08 and $0x8,%al
> 2: 49 c1 e1 06 shl $0x6,%r9
> 6: 44 89 7c 24 18 mov %r15d,0x18(%rsp)
> b: 45 31 ed xor %r13d,%r13d
> e: 45 31 c0 xor %r8d,%r8d
> 11: 31 ff xor %edi,%edi
> 13: 89 44 24 14 mov %eax,0x14(%rsp)
> 17: 4c 03 8b 90 01 00 00 add 0x190(%rbx),%r9
> 1e: eb 04 jmp 0x24
> 20: 39 ca cmp %ecx,%edx
> 22: 73 37 jae 0x5b
> 24: 4d 8b 39 mov (%r9),%r15
> 27: 83 c7 01 add $0x1,%edi
> 2a:* 49 8b 17 mov (%r15),%rdx <-- trapping instruction
> 2d: 49 89 11 mov %rdx,(%r9)
> 30: 41 8b 57 28 mov 0x28(%r15),%edx
> 34: 45 8b 5f 34 mov 0x34(%r15),%r11d
> 38: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
> 3f: 49 rex.WB
>
> Code starting with the faulting instruction
> ===========================================
> 0: 49 8b 17 mov (%r15),%rdx
> 3: 49 89 11 mov %rdx,(%r9)
> 6: 41 8b 57 28 mov 0x28(%r15),%edx
> a: 45 8b 5f 34 mov 0x34(%r15),%r11d
> e: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
> 15: 49 rex.WB
> [ 70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
> [ 70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
> [ 70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> [ 70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
> [ 70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
> [ 70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
> [ 70.724561] FS: 000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
> [ 70.724561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
> [ 70.724561] Call Trace:
> [ 70.724561] <TASK>
> [ 70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [ 70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
> [ 70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> [ 70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> [ 70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
> [ 70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
> [ 70.724561] ? irqentry_enter (kernel/entry/common.c:237)
> [ 70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
> [ 70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
> [ 70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
> [ 70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
> [ 70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
> [ 70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
> [ 70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
> [ 70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
> [ 70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
> [ 70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
> [ 70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
> [ 70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> [ 70.724561] RIP: 0033:0x41ae09
>
> Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jonathan Davies <jonathan.davies@nutanix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
2024-09-25 17:51 ` Joe Damato
@ 2024-09-26 9:13 ` Willem de Bruijn
2024-09-26 9:17 ` Willem de Bruijn
1 sibling, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26 9:13 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet
Eric Dumazet wrote:
> One path takes care of SKB_GSO_DODGY, assuming
> skb->len is bigger than hdr_len.
>
> virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> it only make sure it is at least 20 bytes.
>
> It is possible for an user to provide a malicious 'GSO' packet,
> total length of 80 bytes.
>
> - 20 bytes of IPv4 header
> - 60 bytes TCP header
> - a small gso_size like 8
>
> virtio_net_hdr_to_skb() would declare this packet as a normal
> GSO packet, because it would see 40 bytes of payload,
> bigger than gso_size.
>
> We need to make detect this case to not underflow
> qdisc_skb_cb(skb)->pkt_len.
>
> Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> net/core/dev.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> hdr_len += sizeof(struct udphdr);
> }
>
> - if (shinfo->gso_type & SKB_GSO_DODGY)
> - gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> - shinfo->gso_size);
> + if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> + int payload = skb->len - hdr_len;
>
> + /* Malicious packet. */
> + if (payload <= 0)
> + return;
> + gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> + }
Especially for a malicious packet, should gso_segs be reinitialized to
a sane value? As sane as feasible when other fields cannot be fully
trusted..
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> }
> }
> --
> 2.46.0.792.g87dc391469-goog
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-26 9:13 ` Willem de Bruijn
@ 2024-09-26 9:17 ` Willem de Bruijn
2024-09-26 9:19 ` Eric Dumazet
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26 9:17 UTC (permalink / raw)
To: Willem de Bruijn, Eric Dumazet, David S . Miller, Jakub Kicinski,
Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
eric.dumazet, Eric Dumazet
Willem de Bruijn wrote:
> Eric Dumazet wrote:
> > One path takes care of SKB_GSO_DODGY, assuming
> > skb->len is bigger than hdr_len.
> >
> > virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> > it only make sure it is at least 20 bytes.
> >
> > It is possible for an user to provide a malicious 'GSO' packet,
> > total length of 80 bytes.
> >
> > - 20 bytes of IPv4 header
> > - 60 bytes TCP header
> > - a small gso_size like 8
> >
> > virtio_net_hdr_to_skb() would declare this packet as a normal
> > GSO packet, because it would see 40 bytes of payload,
> > bigger than gso_size.
> >
> > We need to make detect this case to not underflow
> > qdisc_skb_cb(skb)->pkt_len.
> >
> > Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> > net/core/dev.c | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> > hdr_len += sizeof(struct udphdr);
> > }
> >
> > - if (shinfo->gso_type & SKB_GSO_DODGY)
> > - gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> > - shinfo->gso_size);
> > + if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> > + int payload = skb->len - hdr_len;
> >
> > + /* Malicious packet. */
> > + if (payload <= 0)
> > + return;
> > + gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> > + }
>
> Especially for a malicious packet, should gso_segs be reinitialized to
> a sane value? As sane as feasible when other fields cannot be fully
> trusted..
Never mind. I guess the best thing we can do is to leave pkt_len as
skb->len, indeed.
Reviewed-by: Willem de Bruijn <willemb@google.com>
> > qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> > }
> > }
> > --
> > 2.46.0.792.g87dc391469-goog
> >
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-26 9:17 ` Willem de Bruijn
@ 2024-09-26 9:19 ` Eric Dumazet
2024-10-01 9:57 ` Paolo Abeni
0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-26 9:19 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet
On Thu, Sep 26, 2024 at 11:17 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Willem de Bruijn wrote:
> > Eric Dumazet wrote:
> > > One path takes care of SKB_GSO_DODGY, assuming
> > > skb->len is bigger than hdr_len.
> > >
> > > virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> > > it only make sure it is at least 20 bytes.
> > >
> > > It is possible for an user to provide a malicious 'GSO' packet,
> > > total length of 80 bytes.
> > >
> > > - 20 bytes of IPv4 header
> > > - 60 bytes TCP header
> > > - a small gso_size like 8
> > >
> > > virtio_net_hdr_to_skb() would declare this packet as a normal
> > > GSO packet, because it would see 40 bytes of payload,
> > > bigger than gso_size.
> > >
> > > We need to make detect this case to not underflow
> > > qdisc_skb_cb(skb)->pkt_len.
> > >
> > > Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > ---
> > > net/core/dev.c | 10 +++++++---
> > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> > > hdr_len += sizeof(struct udphdr);
> > > }
> > >
> > > - if (shinfo->gso_type & SKB_GSO_DODGY)
> > > - gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> > > - shinfo->gso_size);
> > > + if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> > > + int payload = skb->len - hdr_len;
> > >
> > > + /* Malicious packet. */
> > > + if (payload <= 0)
> > > + return;
> > > + gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> > > + }
> >
> > Especially for a malicious packet, should gso_segs be reinitialized to
> > a sane value? As sane as feasible when other fields cannot be fully
> > trusted..
>
> Never mind. I guess the best thing we can do is to leave pkt_len as
> skb->len, indeed.
>
It is unclear if we can change a field in skb here, I hope that in the
future we can make virtio_net_hdr_to_skb() safer.
Role of qdisc_pkt_len_init() is to set a private skb->cb[] field for
qdisc layer.
Thanks !
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
2024-09-26 9:07 ` Willem de Bruijn
@ 2024-09-26 18:00 ` Jonathan Davies
1 sibling, 0 replies; 20+ messages in thread
From: Jonathan Davies @ 2024-09-26 18:00 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: David Ahern, netdev, Willem de Bruijn, eric.dumazet, syzbot
On 24/09/2024 16:02, Eric Dumazet wrote:
> After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
> requests coming from user") virtio_net_hdr_to_skb() had sanity check
> to detect malicious attempts from user space to cook a bad GSO packet.
>
> Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
> transport header in UFO") while fixing one issue
I've tested that the issue fixed by commit cf9acc90c80ec isn't adversely
affected by your patch, so:
Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
>, allowed user space
> to cook a GSO packet with the following characteristic :
>
> IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
>
> When this packet arrives in qdisc_pkt_len_init(), we end up
> with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
>
> Then the following sets gso_segs to 0 :
>
> gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> shinfo->gso_size);
>
> Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
>
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
>
> This leads to the following crash in fq_codel [1]
>
> qdisc_pkt_len_init() is best effort, we only want an estimation
> of the bytes sent on the wire, not crashing the kernel.
>
> This patch is fixing this particular issue, a following one
> adds more sanity checks for another potential bug.
>
> [1]
> [ 70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 70.724561] #PF: supervisor read access in kernel mode
> [ 70.724561] #PF: error_code(0x0000) - not-present page
> [ 70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
> [ 70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
> [ 70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
> All code
> ========
> 0: 24 08 and $0x8,%al
> 2: 49 c1 e1 06 shl $0x6,%r9
> 6: 44 89 7c 24 18 mov %r15d,0x18(%rsp)
> b: 45 31 ed xor %r13d,%r13d
> e: 45 31 c0 xor %r8d,%r8d
> 11: 31 ff xor %edi,%edi
> 13: 89 44 24 14 mov %eax,0x14(%rsp)
> 17: 4c 03 8b 90 01 00 00 add 0x190(%rbx),%r9
> 1e: eb 04 jmp 0x24
> 20: 39 ca cmp %ecx,%edx
> 22: 73 37 jae 0x5b
> 24: 4d 8b 39 mov (%r9),%r15
> 27: 83 c7 01 add $0x1,%edi
> 2a:* 49 8b 17 mov (%r15),%rdx <-- trapping instruction
> 2d: 49 89 11 mov %rdx,(%r9)
> 30: 41 8b 57 28 mov 0x28(%r15),%edx
> 34: 45 8b 5f 34 mov 0x34(%r15),%r11d
> 38: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
> 3f: 49 rex.WB
>
> Code starting with the faulting instruction
> ===========================================
> 0: 49 8b 17 mov (%r15),%rdx
> 3: 49 89 11 mov %rdx,(%r9)
> 6: 41 8b 57 28 mov 0x28(%r15),%edx
> a: 45 8b 5f 34 mov 0x34(%r15),%r11d
> e: 49 c7 07 00 00 00 00 movq $0x0,(%r15)
> 15: 49 rex.WB
> [ 70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
> [ 70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
> [ 70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> [ 70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
> [ 70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
> [ 70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
> [ 70.724561] FS: 000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
> [ 70.724561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
> [ 70.724561] Call Trace:
> [ 70.724561] <TASK>
> [ 70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [ 70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
> [ 70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> [ 70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> [ 70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
> [ 70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
> [ 70.724561] ? irqentry_enter (kernel/entry/common.c:237)
> [ 70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
> [ 70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
> [ 70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
> [ 70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
> [ 70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
> [ 70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
> [ 70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
> [ 70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
> [ 70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
> [ 70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
> [ 70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
> [ 70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> [ 70.724561] RIP: 0033:0x41ae09
>
> Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jonathan Davies <jonathan.davies@nutanix.com>
> ---
> net/core/dev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 1e740faf9e783b047b2dc7d9fd4242e4e6c7317a..f2c47da79f17d5ebe6b334b63d66c84c84c519fc 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3758,7 +3758,7 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> sizeof(_tcphdr), &_tcphdr);
> if (likely(th))
> hdr_len += __tcp_hdrlen(th);
> - } else {
> + } else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
> struct udphdr _udphdr;
>
> if (skb_header_pointer(skb, hdr_len,
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
2024-09-26 9:19 ` Eric Dumazet
@ 2024-10-01 9:57 ` Paolo Abeni
0 siblings, 0 replies; 20+ messages in thread
From: Paolo Abeni @ 2024-10-01 9:57 UTC (permalink / raw)
To: Eric Dumazet, Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, David Ahern, netdev,
Willem de Bruijn, Jonathan Davies, eric.dumazet
On 9/26/24 11:19, Eric Dumazet wrote:
> On Thu, Sep 26, 2024 at 11:17 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>
>> Willem de Bruijn wrote:
>>> Eric Dumazet wrote:
>>>> One path takes care of SKB_GSO_DODGY, assuming
>>>> skb->len is bigger than hdr_len.
>>>>
>>>> virtio_net_hdr_to_skb() does not fully dissect TCP headers,
>>>> it only make sure it is at least 20 bytes.
>>>>
>>>> It is possible for an user to provide a malicious 'GSO' packet,
>>>> total length of 80 bytes.
>>>>
>>>> - 20 bytes of IPv4 header
>>>> - 60 bytes TCP header
>>>> - a small gso_size like 8
>>>>
>>>> virtio_net_hdr_to_skb() would declare this packet as a normal
>>>> GSO packet, because it would see 40 bytes of payload,
>>>> bigger than gso_size.
>>>>
>>>> We need to make detect this case to not underflow
>>>> qdisc_skb_cb(skb)->pkt_len.
>>>>
>>>> Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
>>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>> ---
>>>> net/core/dev.c | 10 +++++++---
>>>> 1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>> index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
>>>> --- a/net/core/dev.c
>>>> +++ b/net/core/dev.c
>>>> @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
>>>> hdr_len += sizeof(struct udphdr);
>>>> }
>>>>
>>>> - if (shinfo->gso_type & SKB_GSO_DODGY)
>>>> - gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
>>>> - shinfo->gso_size);
>>>> + if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
>>>> + int payload = skb->len - hdr_len;
>>>>
>>>> + /* Malicious packet. */
>>>> + if (payload <= 0)
>>>> + return;
>>>> + gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
>>>> + }
>>>
>>> Especially for a malicious packet, should gso_segs be reinitialized to
>>> a sane value? As sane as feasible when other fields cannot be fully
>>> trusted..
>>
>> Never mind. I guess the best thing we can do is to leave pkt_len as
>> skb->len, indeed.
>
> It is unclear if we can change a field in skb here, I hope that in the
> future we can make virtio_net_hdr_to_skb() safer.
A problem with virtio_net_hdr_to_skb() is that is needs to cope with
existing implementation bending the virtio spec to its limits, and the
spec definition allowed for some 'fun'. Sanitize everything is likely to
break existing users.
On a somewhat related note, I'm trying to push a new virtio-spec GSO
feature (UDP tunnel support) that will inevitably increase the attack
surface. On the flip side the general idea is to be as strict as
possible with the definition of the new allowed gso types, to try to
avoid the above kind of scenarios (for such gso types).
The above just FYI,
Paolo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
` (2 preceding siblings ...)
2024-09-26 2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
@ 2024-10-01 10:10 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 20+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-01 10:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: davem, kuba, pabeni, dsahern, netdev, willemb, jonathan.davies,
eric.dumazet
Hello:
This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Tue, 24 Sep 2024 15:02:55 +0000 you wrote:
> Inspired by one syzbot report.
>
> At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
> having a sane value (not zero)
>
> With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
> to precisely set qdisc_skb_cb(skb)->pkt_len to zero.
>
> [...]
Here is the summary with links:
- [net,1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
https://git.kernel.org/netdev/net/c/c20029db2839
- [net,2/2] net: add more sanity checks to qdisc_pkt_len_init()
https://git.kernel.org/netdev/net/c/ab9a9a9e9647
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-10-01 10:10 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
2024-09-26 9:07 ` Willem de Bruijn
2024-09-26 18:00 ` Jonathan Davies
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
2024-09-25 17:51 ` Joe Damato
2024-09-25 18:00 ` Eric Dumazet
2024-09-25 18:24 ` Joe Damato
2024-09-25 18:27 ` Stephen Hemminger
2024-09-25 18:55 ` Eric Dumazet
2024-09-25 19:01 ` Eric Dumazet
2024-09-25 19:15 ` Joe Damato
2024-09-25 19:07 ` Joe Damato
2024-09-25 19:08 ` Eric Dumazet
2024-09-26 9:13 ` Willem de Bruijn
2024-09-26 9:17 ` Willem de Bruijn
2024-09-26 9:19 ` Eric Dumazet
2024-10-01 9:57 ` Paolo Abeni
2024-09-26 2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
2024-10-01 10:10 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).