* [PATCH net] net: clear transport header during tunnel decapsulation
@ 2026-06-24  7:32 Eric Dumazet
  2026-06-24 10:41 ` Jiayuan Chen
  2026-06-24 12:14 ` [syzbot ci] " syzbot ci
  0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2026-06-24  7:32 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Ido Schimmel, David Ahern, netdev, eric.dumazet,
	Eric Dumazet, syzbot+d5d0d598a4cfdfafdc3b

Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().

The root cause is a stale, negative transport header offset carried over
during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
decapsulates a packet, it pulls the outer headers but leaves the transport
header pointing to the outer UDP header. This offset becomes negative
relative to the new skb->data (inner IP header).
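
For illustration only (not part of the patch), the userspace sketch below
models how the offset turns negative. struct toy_skb, toy_transport_offset()
and toy_pull() are hypothetical, simplified stand-ins for struct sk_buff,
skb_transport_offset() and skb_pull(); the header sizes used in toy_demo()
are assumed example values, not taken from the syzbot report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct toy_skb {
	unsigned char *head;       /* start of the buffer */
	unsigned char *data;       /* start of the current header */
	unsigned int len;          /* bytes available from data */
	uint16_t transport_header; /* offset from head, as in the kernel */
};

/* Modeled on skb_transport_offset(): transport header relative to data. */
static int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)((skb->head + skb->transport_header) - skb->data);
}

/* Modeled on skb_pull(): advance data, leave transport_header untouched. */
static void toy_pull(struct toy_skb *skb, unsigned int len)
{
	assert(len <= skb->len);
	skb->data += len;
	skb->len -= len;
}

/* Assumed layout: outer Ethernet (14) + outer IPv4 (20), then outer UDP. */
static void toy_demo(void)
{
	static unsigned char buf[256];
	struct toy_skb skb = {
		.head = buf,
		.data = buf + 34,       /* data sits at the outer UDP header */
		.len = 200,
		.transport_header = 34, /* and so does the transport header */
	};

	assert(toy_transport_offset(&skb) == 0);

	/* Decapsulation: pull UDP (8) + VXLAN (8) + inner Ethernet (14). */
	toy_pull(&skb, 8 + 8 + 14);

	/* The stale transport header now lies 30 bytes before data. */
	assert(toy_transport_offset(&skb) == -30);
}
```

Calling toy_demo() runs both checks; the second one is the stale, negative
offset described above.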

If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
"unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
processing (sch_handle_ingress) happens before Layer 3 IP reception
(ip_rcv_core) can run and reset the transport header. Consequently,
qdisc_pkt_len_segs_init() attempts to validate the transport header using
pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
the unsigned cast in pskb_may_pull(), triggering the assertion.
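
A minimal userspace sketch of that arithmetic, for illustration only:
toy_len_trips_assert() is a hypothetical helper that mimics passing a
negative length to a function taking an unsigned int and then applying the
len > INT_MAX check. The values -30 and 20 are assumed examples, and the
sketch assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical model: hdr_len is a signed offset, l4_hdr_size the size of
 * the L4 header to validate. The sum is converted to unsigned int, as it
 * would be when passed to a function with an unsigned int length argument.
 */
static bool toy_len_trips_assert(int hdr_len, unsigned int l4_hdr_size)
{
	unsigned int len = (unsigned int)hdr_len + l4_hdr_size;

	return len > INT_MAX; /* same condition as the reported assertion */
}

static void toy_demo(void)
{
	/* Sane offset: 20 + 20 = 40, well below INT_MAX. */
	assert(!toy_len_trips_assert(20, 20));

	/* Stale negative offset: -30 + 20 wraps to a value near UINT_MAX. */
	assert(toy_len_trips_assert(-30, 20));
}
```

The sum only stays below INT_MAX when the offset is non-negative, which is
why a stale offset from before decapsulation is enough to trip the check.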

Fix this by clearing the transport header to the ~0U sentinel value during
decapsulation. This ensures that:
1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
   and returns early without warning.
2) The IP layer (ip_rcv_core) later correctly resets the transport header
   to the inner L4 header offset.
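
The two points above can be sketched in userspace as follows. This is an
illustration under assumptions, not kernel code: struct toy_skb and the
toy_*() functions are hypothetical stand-ins, with only the sentinel
comparison mirroring skb_transport_header_was_set() from the diff below, and
toy_ingress_would_validate() is an assumed model of the early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the field relevant here. */
struct toy_skb {
	uint16_t transport_header;
};

/* Sentinel meaning "not set": ~0U truncated to the field's type. */
#define TOY_UNSET ((uint16_t)~0U)

static bool toy_transport_header_was_set(const struct toy_skb *skb)
{
	return skb->transport_header != TOY_UNSET;
}

static void toy_unset_transport_header(struct toy_skb *skb)
{
	skb->transport_header = TOY_UNSET;
}

/*
 * Assumed model of the ingress check: validate the L4 header only when
 * the transport header was set; otherwise return early.
 */
static bool toy_ingress_would_validate(const struct toy_skb *skb)
{
	return toy_transport_header_was_set(skb);
}

static void toy_demo(void)
{
	struct toy_skb skb = { .transport_header = 34 };

	/* Stale but "set": the ingress path would try to validate it. */
	assert(toy_ingress_would_validate(&skb));

	/* After clearing at decapsulation, validation is skipped. */
	toy_unset_transport_header(&skb);
	assert(!toy_ingress_would_validate(&skb));

	/* A later reset (modeling the IP layer) makes it valid again. */
	skb.transport_header = 84;
	assert(toy_ingress_would_validate(&skb));
}
```

The offsets 34 and 84 in toy_demo() are arbitrary example values.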

Introduce skb_unset_transport_header() helper and apply it in the main
decapsulation paths:
1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
2) vxlan_rcv() (covering VXLAN)

This restores skb invariants at the decapsulation boundary without adding
overhead to the Qdisc fast path.

Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Assisted-by: Gemini:gemini-3.1-pro
---
 drivers/net/vxlan/vxlan_core.c | 1 +
 include/linux/skbuff.h         | 5 +++++
 net/ipv4/ip_tunnel_core.c      | 1 +
 3 files changed, 7 insertions(+)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 67c367cc566233e809b0f70e0d939dd1c1ac0d9f..49318ad8164a2f2572fc58c0ed449b68922ae71e 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1799,6 +1799,7 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
 
 	dev_dstats_rx_add(vxlan->dev, skb->len);
 	vxlan_vnifilter_count(vxlan, vni, vninode, VXLAN_VNI_STATS_RX, skb->len);
+	skb_unset_transport_header(skb);
 	gro_cells_receive(&vxlan->gro_cells, skb);
 
 	rcu_read_unlock();
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 115db8c44db21383632dd150a17c9ddcc03508e4..e8305a0fd3857ab85da4c2e8322989ed93e88d87 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3084,6 +3084,11 @@ static inline bool skb_transport_header_was_set(const struct sk_buff *skb)
 	return skb->transport_header != (typeof(skb->transport_header))~0U;
 }
 
+static inline void skb_unset_transport_header(struct sk_buff *skb)
+{
+	skb->transport_header = (typeof(skb->transport_header))~0U;
+}
+
 static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
 {
 	DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb));
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index d3c677e9bff2080e4760347a3d873da4e83ac3ca..59192f58da2e3aae19d00505cc3bb04b083b77c5 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -134,6 +134,7 @@ int __iptunnel_pull_header(struct sk_buff *skb, int hdr_len,
 	__vlan_hwaccel_clear_tag(skb);
 	skb_set_queue_mapping(skb, 0);
 	skb_scrub_packet(skb, xnet);
+	skb_unset_transport_header(skb);
 
 	return iptunnel_pull_offloads(skb);
 }
-- 
2.55.0.rc0.799.gd6f94ed593-goog



* Re: [PATCH net] net: clear transport header during tunnel decapsulation
  2026-06-24  7:32 [PATCH net] net: clear transport header during tunnel decapsulation Eric Dumazet
@ 2026-06-24 10:41 ` Jiayuan Chen
  2026-06-24 11:44   ` Eric Dumazet
  2026-06-24 12:14 ` [syzbot ci] " syzbot ci
  1 sibling, 1 reply; 4+ messages in thread
From: Jiayuan Chen @ 2026-06-24 10:41 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Ido Schimmel, David Ahern, netdev, eric.dumazet,
	syzbot+d5d0d598a4cfdfafdc3b


On 6/24/26 3:32 PM, Eric Dumazet wrote:
> Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
> pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().
>
> The root cause is a stale, negative transport header offset carried over
> during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
> decapsulates a packet, it pulls the outer headers but leaves the transport
> header pointing to the outer UDP header. This offset becomes negative
> relative to the new skb->data (inner IP header).
>
> If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
> "unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
> it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
> processing (sch_handle_ingress) happens before Layer 3 IP reception
> (ip_rcv_core) can run and reset the transport header. Consequently,
> qdisc_pkt_len_segs_init() attempts to validate the transport header using
> pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
> the unsigned cast in pskb_may_pull(), triggering the assertion.
>
> Fix this by clearing the transport header to the ~0U sentinel value during
> decapsulation. This ensures that:
> 1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
>     and returns early without warning.
> 2) The IP layer (ip_rcv_core) later correctly resets the transport header
>     to the inner L4 header offset.
>
> Introduce skb_unset_transport_header() helper and apply it in the main
> decapsulation paths:
> 1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
> 2) vxlan_rcv() (covering VXLAN)
>
> This restores skb invariants at the decapsulation boundary without adding
> overhead to the Qdisc fast path.
>
> Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Assisted-by: Gemini:gemini-3.1-pro


I think a negative skb_transport_offset() should break something else too,
so the Fixes tag looks wrong, but I couldn't find any actual breakage
(either by luck, or I'm missing it).

Hope sashiko reads this reply and confirms it....




* Re: [PATCH net] net: clear transport header during tunnel decapsulation
  2026-06-24 10:41 ` Jiayuan Chen
@ 2026-06-24 11:44   ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2026-06-24 11:44 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Ido Schimmel, David Ahern, netdev, eric.dumazet,
	syzbot+d5d0d598a4cfdfafdc3b

On Wed, Jun 24, 2026 at 3:41 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
>
> On 6/24/26 3:32 PM, Eric Dumazet wrote:
> > Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
> > pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().
> >
> > The root cause is a stale, negative transport header offset carried over
> > during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
> > decapsulates a packet, it pulls the outer headers but leaves the transport
> > header pointing to the outer UDP header. This offset becomes negative
> > relative to the new skb->data (inner IP header).
> >
> > If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
> > "unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
> > it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
> > processing (sch_handle_ingress) happens before Layer 3 IP reception
> > (ip_rcv_core) can run and reset the transport header. Consequently,
> > qdisc_pkt_len_segs_init() attempts to validate the transport header using
> > pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
> > the unsigned cast in pskb_may_pull(), triggering the assertion.
> >
> > Fix this by clearing the transport header to the ~0U sentinel value during
> > decapsulation. This ensures that:
> > 1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
> >     and returns early without warning.
> > 2) The IP layer (ip_rcv_core) later correctly resets the transport header
> >     to the inner L4 header offset.
> >
> > Introduce skb_unset_transport_header() helper and apply it in the main
> > decapsulation paths:
> > 1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
> > 2) vxlan_rcv() (covering VXLAN)
> >
> > This restores skb invariants at the decapsulation boundary without adding
> > overhead to the Qdisc fast path.
> >
> > Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> > Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Assisted-by: Gemini:gemini-3.1-pro
>
>
> I think a negative skb_transport_offset() should break something else too,
> so the Fixes tag looks wrong, but I couldn't find any actual breakage
> (luck, or I'm missing it).

Read the changelog again: on ingress, the transport header is set a bit
later in the stack.

Nothing needs it before that point, except qdisc_pkt_len_segs_init()
if/when it is called on ingress.

>
> Hope sashiko read this reply and confirm it....

On older kernels (before 7fb4c1967011 ("net: pull headers in
qdisc_pkt_len_segs_init()")), the bug is completely latent and harmless.

Pointing the Fixes tag at that commit prevents unnecessary backporting churn
and potential merge conflicts on very old kernels where
skb_unset_transport_header() doesn't exist.

The Historical Option (a6d5bbf34efa / d342894c5d28):

If we point to the original commits that introduced the tunnels,
we are historically accurate, but we risk stable scripts trying to
backport this fix all the way back to 2012/2016
(e.g. kernel 3.7 or 4.6), which is unnecessary and highly likely to
fail to apply.


* [syzbot ci] Re: net: clear transport header during tunnel decapsulation
  2026-06-24  7:32 [PATCH net] net: clear transport header during tunnel decapsulation Eric Dumazet
  2026-06-24 10:41 ` Jiayuan Chen
@ 2026-06-24 12:14 ` syzbot ci
  1 sibling, 0 replies; 4+ messages in thread
From: syzbot ci @ 2026-06-24 12:14 UTC (permalink / raw)
  To: davem, dsahern, edumazet, eric.dumazet, horms, idosch, kuba,
	netdev, pabeni, syzbot
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v1] net: clear transport header during tunnel decapsulation
https://lore.kernel.org/all/20260624073209.3703492-1-edumazet@google.com
* [PATCH net] net: clear transport header during tunnel decapsulation

and found the following issue:
WARNING in geneve_udp_encap_recv

Full report is available here:
https://ci.syzbot.org/series/1f6dc47e-354f-4904-bc18-c2b7ea4d79b2

***

WARNING in geneve_udp_encap_recv

tree:      net
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net.git
base:      d87363b0edfc7504ff2b144fe4cdd8154f90f42e
arch:      amd64
compiler:  Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6
config:    https://ci.syzbot.org/builds/7bb83c16-d55d-4d99-8c2b-6050e0022ef6/config

------------[ cut here ]------------
!skb_transport_header_was_set(skb)
WARNING: ./include/linux/skbuff.h:3094 at geneve_udp_encap_recv+0x26ed/0x4130, CPU#1: kworker/1:3/5072
Modules linked in:
CPU: 1 UID: 0 PID: 5072 Comm: kworker/1:3 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: mld mld_ifc_work
RIP: 0010:geneve_udp_encap_recv+0x26ed/0x4130
Code: c8 00 00 00 0f b6 04 01 84 c0 0f 85 f2 18 00 00 48 8b 7c 24 78 66 44 89 6f 0a e8 7e 9b 7a 03 e9 8c 00 00 00 e8 f4 fc 33 fb 90 <0f> 0b 90 e9 2e e3 ff ff 49 83 c6 06 4c 89 f0 48 c1 e8 03 48 b9 00
RSP: 0018:ffffc90000a08620 EFLAGS: 00010246
RAX: ffffffff8691f92c RBX: ffff8881bcc8b5d0 RCX: ffff88816a7abb80
RDX: 0000000000000100 RSI: 000000000000ffff RDI: 000000000000ffff
RBP: ffffc90000a08790 R08: ffffffff903114f7 R09: 1ffffffff206229e
R10: dffffc0000000000 R11: fffffbfff206229f R12: ffff888109f6a108
R13: 1ffff110213ed5df R14: 0000000000000010 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8882a927b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6956ea5440 CR3: 000000016f55c000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 udp_queue_rcv_one_skb+0xfc5/0x10e0
 udp_unicast_rcv_skb+0x21a/0x3a0
 udp_rcv+0xecb/0x1db0
 ip_protocol_deliver_rcu+0x27e/0x440
 ip_local_deliver_finish+0x3bb/0x6f0
 NF_HOOK+0x336/0x3c0
 NF_HOOK+0x336/0x3c0
 process_backlog+0xa34/0x1860
 __napi_poll+0xaa/0x330
 net_rx_action+0x61d/0xf50
 handle_softirqs+0x225/0x840
 do_softirq+0x76/0xd0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0xf8/0x130
 __dev_queue_xmit+0x1ed7/0x37f0
 ip6_output+0x337/0x540
 NF_HOOK+0x177/0x4f0
 mld_sendpack+0x890/0xe10
 mld_ifc_work+0x839/0xe70
 process_scheduled_works+0xa8e/0x14e0
 worker_thread+0xa47/0xfb0
 kthread+0x388/0x470
 ret_from_fork+0x514/0xb70
 ret_from_fork_asm+0x1a/0x30
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.
