netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
@ 2024-09-24 15:02 Eric Dumazet
  2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet

Inspired by one syzbot report.

At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
having a sane value (not zero)

With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
to precisely set qdisc_skb_cb(skb)->pkt_len to zero.

First patch fixes this issue.

Second one (a separate one to help future bisections) adds
more sanity check to SKB_GSO_DODGY users.

Eric Dumazet (2):
  net: avoid potential underflow in qdisc_pkt_len_init() with UFO
  net: add more sanity checks to qdisc_pkt_len_init()

 net/core/dev.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

-- 
2.46.0.792.g87dc391469-goog


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
  2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-24 15:02 ` Eric Dumazet
  2024-09-26  9:07   ` Willem de Bruijn
  2024-09-26 18:00   ` Jonathan Davies
  2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet, syzbot

After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
requests coming from user") virtio_net_hdr_to_skb() had sanity check
to detect malicious attempts from user space to cook a bad GSO packet.

Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
transport header in UFO") while fixing one issue, allowed user space
to cook a GSO packet with the following characteristic :

IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.

When this packet arrives in qdisc_pkt_len_init(), we end up
with hdr_len = 28 (IPv4 header + UDP header), matching skb->len

Then the following sets gso_segs to 0 :

gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                        shinfo->gso_size);

Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/

qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;

This leads to the following crash in fq_codel [1]

qdisc_pkt_len_init() is best effort, we only want an estimation
of the bytes sent on the wire, not crashing the kernel.

This patch is fixing this particular issue, a following one
adds more sanity checks for another potential bug.

[1]
[   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   70.724561] #PF: supervisor read access in kernel mode
[   70.724561] #PF: error_code(0x0000) - not-present page
[   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
[   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
[   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
[   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
All code
========
   0:	24 08                	and    $0x8,%al
   2:	49 c1 e1 06          	shl    $0x6,%r9
   6:	44 89 7c 24 18       	mov    %r15d,0x18(%rsp)
   b:	45 31 ed             	xor    %r13d,%r13d
   e:	45 31 c0             	xor    %r8d,%r8d
  11:	31 ff                	xor    %edi,%edi
  13:	89 44 24 14          	mov    %eax,0x14(%rsp)
  17:	4c 03 8b 90 01 00 00 	add    0x190(%rbx),%r9
  1e:	eb 04                	jmp    0x24
  20:	39 ca                	cmp    %ecx,%edx
  22:	73 37                	jae    0x5b
  24:	4d 8b 39             	mov    (%r9),%r15
  27:	83 c7 01             	add    $0x1,%edi
  2a:*	49 8b 17             	mov    (%r15),%rdx		<-- trapping instruction
  2d:	49 89 11             	mov    %rdx,(%r9)
  30:	41 8b 57 28          	mov    0x28(%r15),%edx
  34:	45 8b 5f 34          	mov    0x34(%r15),%r11d
  38:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
  3f:	49                   	rex.WB

Code starting with the faulting instruction
===========================================
   0:	49 8b 17             	mov    (%r15),%rdx
   3:	49 89 11             	mov    %rdx,(%r9)
   6:	41 8b 57 28          	mov    0x28(%r15),%edx
   a:	45 8b 5f 34          	mov    0x34(%r15),%r11d
   e:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
  15:	49                   	rex.WB
[   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
[   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
[   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
[   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
[   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
[   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
[   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
[   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
[   70.724561] Call Trace:
[   70.724561]  <TASK>
[   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
[   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
[   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
[   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
[   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
[   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
[   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
[   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
[   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
[   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
[   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
[   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
[   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
[   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
[   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[   70.724561] RIP: 0033:0x41ae09

Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jonathan Davies <jonathan.davies@nutanix.com>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1e740faf9e783b047b2dc7d9fd4242e4e6c7317a..f2c47da79f17d5ebe6b334b63d66c84c84c519fc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3758,7 +3758,7 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
 						sizeof(_tcphdr), &_tcphdr);
 			if (likely(th))
 				hdr_len += __tcp_hdrlen(th);
-		} else {
+		} else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
 			struct udphdr _udphdr;
 
 			if (skb_header_pointer(skb, hdr_len,
-- 
2.46.0.792.g87dc391469-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
  2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
@ 2024-09-24 15:02 ` Eric Dumazet
  2024-09-25 17:51   ` Joe Damato
  2024-09-26  9:13   ` Willem de Bruijn
  2024-09-26  2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
  2024-10-01 10:10 ` patchwork-bot+netdevbpf
  3 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-24 15:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet

One path takes care of SKB_GSO_DODGY, assuming
skb->len is bigger than hdr_len.

virtio_net_hdr_to_skb() does not fully dissect TCP headers,
it only make sure it is at least 20 bytes.

It is possible for an user to provide a malicious 'GSO' packet,
total length of 80 bytes.

- 20 bytes of IPv4 header
- 60 bytes TCP header
- a small gso_size like 8

virtio_net_hdr_to_skb() would declare this packet as a normal
GSO packet, because it would see 40 bytes of payload,
bigger than gso_size.

We need to make detect this case to not underflow
qdisc_skb_cb(skb)->pkt_len.

Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
 				hdr_len += sizeof(struct udphdr);
 		}
 
-		if (shinfo->gso_type & SKB_GSO_DODGY)
-			gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
-						shinfo->gso_size);
+		if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
+			int payload = skb->len - hdr_len;
 
+			/* Malicious packet. */
+			if (payload <= 0)
+				return;
+			gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
+		}
 		qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
 	}
 }
-- 
2.46.0.792.g87dc391469-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-25 17:51   ` Joe Damato
  2024-09-25 18:00     ` Eric Dumazet
  2024-09-26  9:13   ` Willem de Bruijn
  1 sibling, 1 reply; 20+ messages in thread
From: Joe Damato @ 2024-09-25 17:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> One path takes care of SKB_GSO_DODGY, assuming
> skb->len is bigger than hdr_len.

My only comment, which you may feel free to ignore, is that we've
recently merged a change to replace the term 'sanity check' in the
code [1].

Given that work is being done to replace terminology in the source
code, I am wondering if that same ruling applies to commit messages.

If so, perhaps the title of this commit can be adjusted?

[1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 17:51   ` Joe Damato
@ 2024-09-25 18:00     ` Eric Dumazet
  2024-09-25 18:24       ` Joe Damato
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 18:00 UTC (permalink / raw)
  To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
	Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:
>
> On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> > One path takes care of SKB_GSO_DODGY, assuming
> > skb->len is bigger than hdr_len.
>
> My only comment, which you may feel free to ignore, is that we've
> recently merged a change to replace the term 'sanity check' in the
> code [1].
>
> Given that work is being done to replace terminology in the source
> code, I am wondering if that same ruling applies to commit messages.
>
> If so, perhaps the title of this commit can be adjusted?
>
> [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/

I guess I could write the changelog in French, to make sure it is all good.

git log --oneline --grep "sanity check" | wc -l
3397

I dunno...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 18:00     ` Eric Dumazet
@ 2024-09-25 18:24       ` Joe Damato
  2024-09-25 18:27         ` Stephen Hemminger
  2024-09-25 18:55         ` Eric Dumazet
  0 siblings, 2 replies; 20+ messages in thread
From: Joe Damato @ 2024-09-25 18:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 08:00:08PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:
> >
> > On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:
> > > One path takes care of SKB_GSO_DODGY, assuming
> > > skb->len is bigger than hdr_len.
> >
> > My only comment, which you may feel free to ignore, is that we've
> > recently merged a change to replace the term 'sanity check' in the
> > code [1].
> >
> > Given that work is being done to replace terminology in the source
> > code, I am wondering if that same ruling applies to commit messages.
> >
> > If so, perhaps the title of this commit can be adjusted?
> >
> > [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/
> 
> I guess I could write the changelog in French, to make sure it is all good.

You could, but then I'd probably email you and remind you that the
documentation explicitly says "It’s important to describe the change
in plain English" [1].

> git log --oneline --grep "sanity check" | wc -l
> 3397

I don't know what this means. We've done it in the past and so
should continue to do it in the future? OK.

> I dunno...

Me either, but if code is being merged as recently as a few days ago
to remove this term...

[1]: https://www.kernel.org/doc/html/v6.11/process/submitting-patches.html#describe-your-changes

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 18:24       ` Joe Damato
@ 2024-09-25 18:27         ` Stephen Hemminger
  2024-09-25 18:55         ` Eric Dumazet
  1 sibling, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2024-09-25 18:27 UTC (permalink / raw)
  To: Joe Damato
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni,
	David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet

On Wed, 25 Sep 2024 11:24:04 -0700
Joe Damato <jdamato@fastly.com> wrote:

> On Wed, Sep 25, 2024 at 08:00:08PM +0200, Eric Dumazet wrote:
> > On Wed, Sep 25, 2024 at 7:52 PM Joe Damato <jdamato@fastly.com> wrote:  
> > >
> > > On Tue, Sep 24, 2024 at 03:02:57PM +0000, Eric Dumazet wrote:  
> > > > One path takes care of SKB_GSO_DODGY, assuming
> > > > skb->len is bigger than hdr_len.  
> > >
> > > My only comment, which you may feel free to ignore, is that we've
> > > recently merged a change to replace the term 'sanity check' in the
> > > code [1].
> > >
> > > Given that work is being done to replace terminology in the source
> > > code, I am wondering if that same ruling applies to commit messages.
> > >
> > > If so, perhaps the title of this commit can be adjusted?
> > >
> > > [1]: https://lore.kernel.org/netdev/20240912171446.12854-1-stephen@networkplumber.org/  
> > 
> > I guess I could write the changelog in French, to make sure it is all good.  
> 
> You could, but then I'd probably email you and remind you that the
> documentation explicitly says "It’s important to describe the change
> in plain English" [1].
> 
> > git log --oneline --grep "sanity check" | wc -l
> > 3397  
> 
> I don't know what this means. We've done it in the past and so
> should continue to do it in the future? OK.
> 
> > I dunno...  
> 
> Me either, but if code is being merged as recently as a few days ago
> to remove this term...
> 
> [1]: https://www.kernel.org/doc/html/v6.11/process/submitting-patches.html#describe-your-changes
> 

Less concerned about commit log or subject.
My patch was more about keeping non-inclusive naming out of current code base.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 18:24       ` Joe Damato
  2024-09-25 18:27         ` Stephen Hemminger
@ 2024-09-25 18:55         ` Eric Dumazet
  2024-09-25 19:01           ` Eric Dumazet
  2024-09-25 19:07           ` Joe Damato
  1 sibling, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 18:55 UTC (permalink / raw)
  To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
	Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
>

>
> > git log --oneline --grep "sanity check" | wc -l
> > 3397
>
> I don't know what this means. We've done it in the past and so
> should continue to do it in the future? OK.

This means that if they are in the changelogs, they can not be removed.
This is immutable stuff.
Should we zap git history because of some 'bad words' that most
authors/committers/reviewers were not even aware of?

I would understand for stuff visible in the code (comments, error messages),
but the changelogs are there and can not be changed.

Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
then we can remove/change the _comment_ I added in this patch.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 18:55         ` Eric Dumazet
@ 2024-09-25 19:01           ` Eric Dumazet
  2024-09-25 19:15             ` Joe Damato
  2024-09-25 19:07           ` Joe Damato
  1 sibling, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 19:01 UTC (permalink / raw)
  To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
	Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 8:55 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> >
>
> >
> > > git log --oneline --grep "sanity check" | wc -l
> > > 3397
> >
> > I don't know what this means. We've done it in the past and so
> > should continue to do it in the future? OK.
>
> This means that if they are in the changelogs, they can not be removed.
> This is immutable stuff.
> Should we zap git history because of some 'bad words' that most
> authors/committers/reviewers were not even aware of?
>
> I would understand for stuff visible in the code (comments, error messages),
> but the changelogs are there and can not be changed.
>
> Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
> then we can remove/change the _comment_ I added in this patch.

BTW, I looked at https://en.wikipedia.org/wiki/Sanity_check
and the non inclusive part is at the very end of it.

I would suggest moving it at the beginning of the article to clearly
educate all potential users.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 18:55         ` Eric Dumazet
  2024-09-25 19:01           ` Eric Dumazet
@ 2024-09-25 19:07           ` Joe Damato
  2024-09-25 19:08             ` Eric Dumazet
  1 sibling, 1 reply; 20+ messages in thread
From: Joe Damato @ 2024-09-25 19:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 08:55:16PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> >
> 
> >
> > > git log --oneline --grep "sanity check" | wc -l
> > > 3397
> >
> > I don't know what this means. We've done it in the past and so
> > should continue to do it in the future? OK.
> 
> This means that if they are in the changelogs, they can not be removed.
> This is immutable stuff.
> Should we zap git history because of some 'bad words' that most
> authors/committers/reviewers were not even aware of?

I never suggested that. I suggested not adding more to the
changelog and also mentioned you could feel free to ignore my
comment.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 19:07           ` Joe Damato
@ 2024-09-25 19:08             ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2024-09-25 19:08 UTC (permalink / raw)
  To: Joe Damato, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, netdev, Willem de Bruijn,
	Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 9:07 PM Joe Damato <jdamato@fastly.com> wrote:
>
> On Wed, Sep 25, 2024 at 08:55:16PM +0200, Eric Dumazet wrote:
> > On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> > >
> >
> > >
> > > > git log --oneline --grep "sanity check" | wc -l
> > > > 3397
> > >
> > > I don't know what this means. We've done it in the past and so
> > > should continue to do it in the future? OK.
> >
> > This means that if they are in the changelogs, they can not be removed.
> > This is immutable stuff.
> > Should we zap git history because of some 'bad words' that most
> > authors/committers/reviewers were not even aware of?
>
> I never suggested that. I suggested not adding more to the
> changelog and also mentioned you could feel free to ignore my
> comment.

I do not ignore comments, I learnt something new today, thanks !

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-25 19:01           ` Eric Dumazet
@ 2024-09-25 19:15             ` Joe Damato
  0 siblings, 0 replies; 20+ messages in thread
From: Joe Damato @ 2024-09-25 19:15 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On Wed, Sep 25, 2024 at 09:01:23PM +0200, Eric Dumazet wrote:
> On Wed, Sep 25, 2024 at 8:55 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Sep 25, 2024 at 8:24 PM Joe Damato <jdamato@fastly.com> wrote:
> > >
> >
> > >
> > > > git log --oneline --grep "sanity check" | wc -l
> > > > 3397
> > >
> > > I don't know what this means. We've done it in the past and so
> > > should continue to do it in the future? OK.
> >
> > This means that if they are in the changelogs, they can not be removed.
> > This is immutable stuff.
> > Should we zap git history because of some 'bad words' that most
> > authors/committers/reviewers were not even aware of?
> >
> > I would understand for stuff visible in the code (comments, error messages),
> > but the changelogs are there and can not be changed.
> >
> > Who knows, maybe in 10 years 'Malicious packet.' will be very offensive,
> > then we can remove/change the _comment_ I added in this patch.
> 
> BTW, I looked at https://en.wikipedia.org/wiki/Sanity_check
> and the non inclusive part is at the very end of it.
> 
> I would suggest moving it at the beginning of the article to clearly
> educate all potential users.

Stephen has provided what was needed, which was: guidance. The
guidance appears to be that code avoid this phrase, but commit
messages are of less concern.

This was not clear to me before, but it is very clear now and I have
learned something for future reviews.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
  2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
  2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
  2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
@ 2024-09-26  2:26 ` David Ahern
  2024-10-01 10:10 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 20+ messages in thread
From: David Ahern @ 2024-09-26  2:26 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On 9/24/24 9:02 AM, Eric Dumazet wrote:
> Inspired by one syzbot report.
> 
> At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
> having a sane value (not zero)
> 
> With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
> to precisely set qdisc_skb_cb(skb)->pkt_len to zero.
> 
> First patch fixes this issue.
> 
> Second one (a separate one to help future bisections) adds
> more sanity check to SKB_GSO_DODGY users.
> 
> Eric Dumazet (2):
>   net: avoid potential underflow in qdisc_pkt_len_init() with UFO
>   net: add more sanity checks to qdisc_pkt_len_init()
> 
>  net/core/dev.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 

LGTM. For the set:

Reviewed-by: David Ahern <dsahern@kernel.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
  2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
@ 2024-09-26  9:07   ` Willem de Bruijn
  2024-09-26 18:00   ` Jonathan Davies
  1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26  9:07 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet, syzbot

Eric Dumazet wrote:
> After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
> requests coming from user") virtio_net_hdr_to_skb() had sanity check
> to detect malicious attempts from user space to cook a bad GSO packet.
> 
> Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
> transport header in UFO") while fixing one issue, allowed user space
> to cook a GSO packet with the following characteristic :
> 
> IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
> 
> When this packet arrives in qdisc_pkt_len_init(), we end up
> with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
> 
> Then the following sets gso_segs to 0 :
> 
> gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
>                         shinfo->gso_size);
> 
> Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
> 
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> 
> This leads to the following crash in fq_codel [1]
> 
> qdisc_pkt_len_init() is best effort, we only want an estimation
> of the bytes sent on the wire, not crashing the kernel.
> 
> This patch is fixing this particular issue, a following one
> adds more sanity checks for another potential bug.
> 
> [1]
> [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [   70.724561] #PF: supervisor read access in kernel mode
> [   70.724561] #PF: error_code(0x0000) - not-present page
> [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
> [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
> [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
> [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
> All code
> ========
>    0:	24 08                	and    $0x8,%al
>    2:	49 c1 e1 06          	shl    $0x6,%r9
>    6:	44 89 7c 24 18       	mov    %r15d,0x18(%rsp)
>    b:	45 31 ed             	xor    %r13d,%r13d
>    e:	45 31 c0             	xor    %r8d,%r8d
>   11:	31 ff                	xor    %edi,%edi
>   13:	89 44 24 14          	mov    %eax,0x14(%rsp)
>   17:	4c 03 8b 90 01 00 00 	add    0x190(%rbx),%r9
>   1e:	eb 04                	jmp    0x24
>   20:	39 ca                	cmp    %ecx,%edx
>   22:	73 37                	jae    0x5b
>   24:	4d 8b 39             	mov    (%r9),%r15
>   27:	83 c7 01             	add    $0x1,%edi
>   2a:*	49 8b 17             	mov    (%r15),%rdx		<-- trapping instruction
>   2d:	49 89 11             	mov    %rdx,(%r9)
>   30:	41 8b 57 28          	mov    0x28(%r15),%edx
>   34:	45 8b 5f 34          	mov    0x34(%r15),%r11d
>   38:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
>   3f:	49                   	rex.WB
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	49 8b 17             	mov    (%r15),%rdx
>    3:	49 89 11             	mov    %rdx,(%r9)
>    6:	41 8b 57 28          	mov    0x28(%r15),%edx
>    a:	45 8b 5f 34          	mov    0x34(%r15),%r11d
>    e:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
>   15:	49                   	rex.WB
> [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
> [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
> [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
> [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
> [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
> [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
> [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
> [   70.724561] Call Trace:
> [   70.724561]  <TASK>
> [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
> [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
> [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
> [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
> [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
> [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
> [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
> [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
> [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
> [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
> [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
> [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
> [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
> [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
> [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
> [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> [   70.724561] RIP: 0033:0x41ae09
> 
> Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jonathan Davies <jonathan.davies@nutanix.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
  2024-09-25 17:51   ` Joe Damato
@ 2024-09-26  9:13   ` Willem de Bruijn
  2024-09-26  9:17     ` Willem de Bruijn
  1 sibling, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26  9:13 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet

Eric Dumazet wrote:
> One path takes care of SKB_GSO_DODGY, assuming
> skb->len is bigger than hdr_len.
> 
> virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> it only make sure it is at least 20 bytes.
> 
> It is possible for an user to provide a malicious 'GSO' packet,
> total length of 80 bytes.
> 
> - 20 bytes of IPv4 header
> - 60 bytes TCP header
> - a small gso_size like 8
> 
> virtio_net_hdr_to_skb() would declare this packet as a normal
> GSO packet, because it would see 40 bytes of payload,
> bigger than gso_size.
> 
> We need to make detect this case to not underflow
> qdisc_skb_cb(skb)->pkt_len.
> 
> Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/core/dev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
>  				hdr_len += sizeof(struct udphdr);
>  		}
>  
> -		if (shinfo->gso_type & SKB_GSO_DODGY)
> -			gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> -						shinfo->gso_size);
> +		if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> +			int payload = skb->len - hdr_len;
>  
> +			/* Malicious packet. */
> +			if (payload <= 0)
> +				return;
> +			gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> +		}

Especially for a malicious packet, should gso_segs be reinitialized to
a sane value? As sane as feasible when other fields cannot be fully
trusted..

>  		qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
>  	}
>  }
> -- 
> 2.46.0.792.g87dc391469-goog
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-26  9:13   ` Willem de Bruijn
@ 2024-09-26  9:17     ` Willem de Bruijn
  2024-09-26  9:19       ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-26  9:17 UTC (permalink / raw)
  To: Willem de Bruijn, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, Jonathan Davies,
	eric.dumazet, Eric Dumazet

Willem de Bruijn wrote:
> Eric Dumazet wrote:
> > One path takes care of SKB_GSO_DODGY, assuming
> > skb->len is bigger than hdr_len.
> > 
> > virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> > it only make sure it is at least 20 bytes.
> > 
> > It is possible for an user to provide a malicious 'GSO' packet,
> > total length of 80 bytes.
> > 
> > - 20 bytes of IPv4 header
> > - 60 bytes TCP header
> > - a small gso_size like 8
> > 
> > virtio_net_hdr_to_skb() would declare this packet as a normal
> > GSO packet, because it would see 40 bytes of payload,
> > bigger than gso_size.
> > 
> > We need to make detect this case to not underflow
> > qdisc_skb_cb(skb)->pkt_len.
> > 
> > Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  net/core/dev.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> >  				hdr_len += sizeof(struct udphdr);
> >  		}
> >  
> > -		if (shinfo->gso_type & SKB_GSO_DODGY)
> > -			gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> > -						shinfo->gso_size);
> > +		if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> > +			int payload = skb->len - hdr_len;
> >  
> > +			/* Malicious packet. */
> > +			if (payload <= 0)
> > +				return;
> > +			gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> > +		}
> 
> Especially for a malicious packet, should gso_segs be reinitialized to
> a sane value? As sane as feasible when other fields cannot be fully
> trusted..

Never mind. I guess the best thing we can do is to leave pkt_len as
skb->len, indeed.

Reviewed-by: Willem de Bruijn <willemb@google.com>

> >  		qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> >  	}
> >  }
> > -- 
> > 2.46.0.792.g87dc391469-goog
> > 
> 
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-26  9:17     ` Willem de Bruijn
@ 2024-09-26  9:19       ` Eric Dumazet
  2024-10-01  9:57         ` Paolo Abeni
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2024-09-26  9:19 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	netdev, Willem de Bruijn, Jonathan Davies, eric.dumazet

On Thu, Sep 26, 2024 at 11:17 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Willem de Bruijn wrote:
> > Eric Dumazet wrote:
> > > One path takes care of SKB_GSO_DODGY, assuming
> > > skb->len is bigger than hdr_len.
> > >
> > > virtio_net_hdr_to_skb() does not fully dissect TCP headers,
> > > it only make sure it is at least 20 bytes.
> > >
> > > It is possible for an user to provide a malicious 'GSO' packet,
> > > total length of 80 bytes.
> > >
> > > - 20 bytes of IPv4 header
> > > - 60 bytes TCP header
> > > - a small gso_size like 8
> > >
> > > virtio_net_hdr_to_skb() would declare this packet as a normal
> > > GSO packet, because it would see 40 bytes of payload,
> > > bigger than gso_size.
> > >
> > > We need to make detect this case to not underflow
> > > qdisc_skb_cb(skb)->pkt_len.
> > >
> > > Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > ---
> > >  net/core/dev.c | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
> > >                             hdr_len += sizeof(struct udphdr);
> > >             }
> > >
> > > -           if (shinfo->gso_type & SKB_GSO_DODGY)
> > > -                   gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
> > > -                                           shinfo->gso_size);
> > > +           if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
> > > +                   int payload = skb->len - hdr_len;
> > >
> > > +                   /* Malicious packet. */
> > > +                   if (payload <= 0)
> > > +                           return;
> > > +                   gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
> > > +           }
> >
> > Especially for a malicious packet, should gso_segs be reinitialized to
> > a sane value? As sane as feasible when other fields cannot be fully
> > trusted..
>
> Never mind. I guess the best thing we can do is to leave pkt_len as
> skb->len, indeed.
>

It is unclear if we can change a field in skb here, I hope that in the
future we can make virtio_net_hdr_to_skb() safer.

Role of qdisc_pkt_len_init() is to set a private skb->cb[] field for
qdisc layer.

Thanks !

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
  2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
  2024-09-26  9:07   ` Willem de Bruijn
@ 2024-09-26 18:00   ` Jonathan Davies
  1 sibling, 0 replies; 20+ messages in thread
From: Jonathan Davies @ 2024-09-26 18:00 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, Willem de Bruijn, eric.dumazet, syzbot

On 24/09/2024 16:02, Eric Dumazet wrote:
> After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
> requests coming from user") virtio_net_hdr_to_skb() had sanity check
> to detect malicious attempts from user space to cook a bad GSO packet.
> 
> Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
> transport header in UFO") while fixing one issue

I've tested that the issue fixed by commit cf9acc90c80ec isn't adversely 
affected by your patch, so:

Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>

>, allowed user space
> to cook a GSO packet with the following characteristic :
> 
> IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
> 
> When this packet arrives in qdisc_pkt_len_init(), we end up
> with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
> 
> Then the following sets gso_segs to 0 :
> 
> gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
>                          shinfo->gso_size);
> 
> Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
> 
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> 
> This leads to the following crash in fq_codel [1]
> 
> qdisc_pkt_len_init() is best effort, we only want an estimation
> of the bytes sent on the wire, not crashing the kernel.
> 
> This patch is fixing this particular issue, a following one
> adds more sanity checks for another potential bug.
> 
> [1]
> [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [   70.724561] #PF: supervisor read access in kernel mode
> [   70.724561] #PF: error_code(0x0000) - not-present page
> [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
> [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
> [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
> [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
> All code
> ========
>     0:	24 08                	and    $0x8,%al
>     2:	49 c1 e1 06          	shl    $0x6,%r9
>     6:	44 89 7c 24 18       	mov    %r15d,0x18(%rsp)
>     b:	45 31 ed             	xor    %r13d,%r13d
>     e:	45 31 c0             	xor    %r8d,%r8d
>    11:	31 ff                	xor    %edi,%edi
>    13:	89 44 24 14          	mov    %eax,0x14(%rsp)
>    17:	4c 03 8b 90 01 00 00 	add    0x190(%rbx),%r9
>    1e:	eb 04                	jmp    0x24
>    20:	39 ca                	cmp    %ecx,%edx
>    22:	73 37                	jae    0x5b
>    24:	4d 8b 39             	mov    (%r9),%r15
>    27:	83 c7 01             	add    $0x1,%edi
>    2a:*	49 8b 17             	mov    (%r15),%rdx		<-- trapping instruction
>    2d:	49 89 11             	mov    %rdx,(%r9)
>    30:	41 8b 57 28          	mov    0x28(%r15),%edx
>    34:	45 8b 5f 34          	mov    0x34(%r15),%r11d
>    38:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
>    3f:	49                   	rex.WB
> 
> Code starting with the faulting instruction
> ===========================================
>     0:	49 8b 17             	mov    (%r15),%rdx
>     3:	49 89 11             	mov    %rdx,(%r9)
>     6:	41 8b 57 28          	mov    0x28(%r15),%edx
>     a:	45 8b 5f 34          	mov    0x34(%r15),%r11d
>     e:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
>    15:	49                   	rex.WB
> [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
> [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
> [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
> [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
> [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
> [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
> [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
> [   70.724561] Call Trace:
> [   70.724561]  <TASK>
> [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
> [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
> [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
> [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
> [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
> [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
> [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
> [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
> [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
> [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
> [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
> [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
> [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
> [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
> [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
> [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
> [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> [   70.724561] RIP: 0033:0x41ae09
> 
> Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jonathan Davies <jonathan.davies@nutanix.com>
> ---
>   net/core/dev.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 1e740faf9e783b047b2dc7d9fd4242e4e6c7317a..f2c47da79f17d5ebe6b334b63d66c84c84c519fc 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3758,7 +3758,7 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
>   						sizeof(_tcphdr), &_tcphdr);
>   			if (likely(th))
>   				hdr_len += __tcp_hdrlen(th);
> -		} else {
> +		} else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
>   			struct udphdr _udphdr;
>   
>   			if (skb_header_pointer(skb, hdr_len,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init()
  2024-09-26  9:19       ` Eric Dumazet
@ 2024-10-01  9:57         ` Paolo Abeni
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Abeni @ 2024-10-01  9:57 UTC (permalink / raw)
  To: Eric Dumazet, Willem de Bruijn
  Cc: David S . Miller, Jakub Kicinski, David Ahern, netdev,
	Willem de Bruijn, Jonathan Davies, eric.dumazet

On 9/26/24 11:19, Eric Dumazet wrote:
> On Thu, Sep 26, 2024 at 11:17 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>
>> Willem de Bruijn wrote:
>>> Eric Dumazet wrote:
>>>> One path takes care of SKB_GSO_DODGY, assuming
>>>> skb->len is bigger than hdr_len.
>>>>
>>>> virtio_net_hdr_to_skb() does not fully dissect TCP headers,
>>>> it only make sure it is at least 20 bytes.
>>>>
>>>> It is possible for an user to provide a malicious 'GSO' packet,
>>>> total length of 80 bytes.
>>>>
>>>> - 20 bytes of IPv4 header
>>>> - 60 bytes TCP header
>>>> - a small gso_size like 8
>>>>
>>>> virtio_net_hdr_to_skb() would declare this packet as a normal
>>>> GSO packet, because it would see 40 bytes of payload,
>>>> bigger than gso_size.
>>>>
>>>> We need to make detect this case to not underflow
>>>> qdisc_skb_cb(skb)->pkt_len.
>>>>
>>>> Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
>>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>> ---
>>>>   net/core/dev.c | 10 +++++++---
>>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>> index f2c47da79f17d5ebe6b334b63d66c84c84c519fc..35b8bcfb209bd274c81380eaf6e445641306b018 100644
>>>> --- a/net/core/dev.c
>>>> +++ b/net/core/dev.c
>>>> @@ -3766,10 +3766,14 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
>>>>                              hdr_len += sizeof(struct udphdr);
>>>>              }
>>>>
>>>> -           if (shinfo->gso_type & SKB_GSO_DODGY)
>>>> -                   gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
>>>> -                                           shinfo->gso_size);
>>>> +           if (unlikely(shinfo->gso_type & SKB_GSO_DODGY)) {
>>>> +                   int payload = skb->len - hdr_len;
>>>>
>>>> +                   /* Malicious packet. */
>>>> +                   if (payload <= 0)
>>>> +                           return;
>>>> +                   gso_segs = DIV_ROUND_UP(payload, shinfo->gso_size);
>>>> +           }
>>>
>>> Especially for a malicious packet, should gso_segs be reinitialized to
>>> a sane value? As sane as feasible when other fields cannot be fully
>>> trusted..
>>
>> Never mind. I guess the best thing we can do is to leave pkt_len as
>> skb->len, indeed.
> 
> It is unclear if we can change a field in skb here, I hope that in the
> future we can make virtio_net_hdr_to_skb() safer.

A problem with virtio_net_hdr_to_skb() is that is needs to cope with 
existing implementation bending the virtio spec to its limits, and the 
spec definition allowed for some 'fun'. Sanitize everything is likely to 
break existing users.

On a somewhat related note, I'm trying to push a new virtio-spec GSO 
feature (UDP tunnel support) that will inevitably increase the attack 
surface. On the flip side the general idea is to be as strict as 
possible with the definition of the new allowed gso types, to try to 
avoid the above kind of scenarios (for such gso types).

The above just FYI,

Paolo


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init()
  2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
                   ` (2 preceding siblings ...)
  2024-09-26  2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
@ 2024-10-01 10:10 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 20+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-01 10:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, dsahern, netdev, willemb, jonathan.davies,
	eric.dumazet

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 24 Sep 2024 15:02:55 +0000 you wrote:
> Inspired by one syzbot report.
> 
> At least one qdisc (fq_codel) depends on qdisc_skb_cb(skb)->pkt_len
> having a sane value (not zero)
> 
> With the help of af_packet, syzbot was able to fool qdisc_pkt_len_init()
> to precisely set qdisc_skb_cb(skb)->pkt_len to zero.
> 
> [...]

Here is the summary with links:
  - [net,1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO
    https://git.kernel.org/netdev/net/c/c20029db2839
  - [net,2/2] net: add more sanity checks to qdisc_pkt_len_init()
    https://git.kernel.org/netdev/net/c/ab9a9a9e9647

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-10-01 10:10 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-24 15:02 [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() Eric Dumazet
2024-09-24 15:02 ` [PATCH net 1/2] net: avoid potential underflow in qdisc_pkt_len_init() with UFO Eric Dumazet
2024-09-26  9:07   ` Willem de Bruijn
2024-09-26 18:00   ` Jonathan Davies
2024-09-24 15:02 ` [PATCH net 2/2] net: add more sanity checks to qdisc_pkt_len_init() Eric Dumazet
2024-09-25 17:51   ` Joe Damato
2024-09-25 18:00     ` Eric Dumazet
2024-09-25 18:24       ` Joe Damato
2024-09-25 18:27         ` Stephen Hemminger
2024-09-25 18:55         ` Eric Dumazet
2024-09-25 19:01           ` Eric Dumazet
2024-09-25 19:15             ` Joe Damato
2024-09-25 19:07           ` Joe Damato
2024-09-25 19:08             ` Eric Dumazet
2024-09-26  9:13   ` Willem de Bruijn
2024-09-26  9:17     ` Willem de Bruijn
2024-09-26  9:19       ` Eric Dumazet
2024-10-01  9:57         ` Paolo Abeni
2024-09-26  2:26 ` [PATCH net 0/2] net: two fixes for qdisc_pkt_len_init() David Ahern
2024-10-01 10:10 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).