* [PATCH net] net: clear transport header during tunnel decapsulation
@ 2026-06-24 7:32 Eric Dumazet
2026-06-24 10:41 ` Jiayuan Chen
2026-06-24 12:14 ` [syzbot ci] " syzbot ci
0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2026-06-24 7:32 UTC
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Ido Schimmel, David Ahern, netdev, eric.dumazet,
Eric Dumazet, syzbot+d5d0d598a4cfdfafdc3b
Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().

The root cause is a stale, negative transport header offset carried over
during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
decapsulates a packet, it pulls the outer headers but leaves the transport
header pointing to the outer UDP header. This offset becomes negative
relative to the new skb->data (inner IP header).

If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
"unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
processing (sch_handle_ingress) happens before Layer 3 IP reception
(ip_rcv_core) can run and reset the transport header. Consequently,
qdisc_pkt_len_segs_init() attempts to validate the transport header using
pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
the unsigned cast in pskb_may_pull(), triggering the assertion.

Fix this by clearing the transport header to the ~0U sentinel value during
decapsulation. This ensures that:

1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
   and returns early without warning.
2) The IP layer (ip_rcv_core) later correctly resets the transport header
   to the inner L4 header offset.

Introduce skb_unset_transport_header() helper and apply it in the main
decapsulation paths:

1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
2) vxlan_rcv() (covering VXLAN)

This restores skb invariants at the decapsulation boundary without adding
overhead to the Qdisc fast path.

Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Assisted-by: Gemini:gemini-3.1-pro
---
drivers/net/vxlan/vxlan_core.c | 1 +
include/linux/skbuff.h | 5 +++++
net/ipv4/ip_tunnel_core.c | 1 +
3 files changed, 7 insertions(+)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 67c367cc566233e809b0f70e0d939dd1c1ac0d9f..49318ad8164a2f2572fc58c0ed449b68922ae71e 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1799,6 +1799,7 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
 	dev_dstats_rx_add(vxlan->dev, skb->len);
 	vxlan_vnifilter_count(vxlan, vni, vninode, VXLAN_VNI_STATS_RX, skb->len);
+	skb_unset_transport_header(skb);
 	gro_cells_receive(&vxlan->gro_cells, skb);
 	rcu_read_unlock();
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 115db8c44db21383632dd150a17c9ddcc03508e4..e8305a0fd3857ab85da4c2e8322989ed93e88d87 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3084,6 +3084,11 @@ static inline bool skb_transport_header_was_set(const struct sk_buff *skb)
 	return skb->transport_header != (typeof(skb->transport_header))~0U;
 }
+static inline void skb_unset_transport_header(struct sk_buff *skb)
+{
+	skb->transport_header = (typeof(skb->transport_header))~0U;
+}
+
 static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
 {
 	DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb));
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index d3c677e9bff2080e4760347a3d873da4e83ac3ca..59192f58da2e3aae19d00505cc3bb04b083b77c5 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -134,6 +134,7 @@ int __iptunnel_pull_header(struct sk_buff *skb, int hdr_len,
 	__vlan_hwaccel_clear_tag(skb);
 	skb_set_queue_mapping(skb, 0);
 	skb_scrub_packet(skb, xnet);
+	skb_unset_transport_header(skb);
 	return iptunnel_pull_offloads(skb);
 }
--
2.55.0.rc0.799.gd6f94ed593-goog
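
The failure mode described in the changelog can be modelled in userspace. The following is only a sketch under stated assumptions: struct toy_skb and every toy_* function are hypothetical stand-ins invented for illustration, not kernel code, and the 16-bit offset type is assumed for the sentinel cast. It shows how a transport header offset left behind by a header pull goes negative relative to data, how adding an L4 header length to it as an unsigned value exceeds INT_MAX, and how the ~0U sentinel lets the check return early.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the sk_buff fields discussed above. */
struct toy_skb {
	unsigned char *head;        /* start of the buffer */
	unsigned char *data;        /* current start of packet data */
	uint16_t transport_header;  /* offset from head, or the ~0U sentinel */
};

static bool toy_transport_header_was_set(const struct toy_skb *skb)
{
	return skb->transport_header != (uint16_t)~0U;
}

static void toy_unset_transport_header(struct toy_skb *skb)
{
	skb->transport_header = (uint16_t)~0U;
}

/* Advance data past len bytes of outer headers, as decapsulation does. */
static void toy_pull(struct toy_skb *skb, unsigned int len)
{
	skb->data += len;
	assert(skb->data >= skb->head);
}

/* Transport header position relative to data; negative once it is stale. */
static int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)skb->transport_header - (int)(skb->data - skb->head);
}

/*
 * Model of the ingress length check: returns true when the requested pull
 * length would exceed INT_MAX, the condition the assertion tests.
 * An unset transport header makes the check return early.
 */
static bool toy_qdisc_would_warn(const struct toy_skb *skb, unsigned int l4_len)
{
	unsigned int len;

	if (!toy_transport_header_was_set(skb))
		return false;

	/* A negative offset wraps to a huge value when converted to unsigned. */
	len = (unsigned int)toy_transport_offset(skb) + l4_len;
	return len > (unsigned int)INT_MAX;
}
```

With transport_header = 34, data starting at head + 34 and a 30-byte pull, toy_transport_offset() yields -30, and toy_qdisc_would_warn() with a 20-byte L4 length returns true. After toy_unset_transport_header() the same call returns false.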
* Re: [PATCH net] net: clear transport header during tunnel decapsulation
2026-06-24 7:32 [PATCH net] net: clear transport header during tunnel decapsulation Eric Dumazet
@ 2026-06-24 10:41 ` Jiayuan Chen
2026-06-24 11:44 ` Eric Dumazet
2026-06-24 12:14 ` [syzbot ci] " syzbot ci
1 sibling, 1 reply; 4+ messages in thread
From: Jiayuan Chen @ 2026-06-24 10:41 UTC
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Ido Schimmel, David Ahern, netdev, eric.dumazet,
syzbot+d5d0d598a4cfdfafdc3b
On 6/24/26 3:32 PM, Eric Dumazet wrote:
> Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
> pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().
>
> The root cause is a stale, negative transport header offset carried over
> during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
> decapsulates a packet, it pulls the outer headers but leaves the transport
> header pointing to the outer UDP header. This offset becomes negative
> relative to the new skb->data (inner IP header).
>
> If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
> "unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
> it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
> processing (sch_handle_ingress) happens before Layer 3 IP reception
> (ip_rcv_core) can run and reset the transport header. Consequently,
> qdisc_pkt_len_segs_init() attempts to validate the transport header using
> pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
> the unsigned cast in pskb_may_pull(), triggering the assertion.
>
> Fix this by clearing the transport header to the ~0U sentinel value during
> decapsulation. This ensures that:
> 1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
> and returns early without warning.
> 2) The IP layer (ip_rcv_core) later correctly resets the transport header
> to the inner L4 header offset.
>
> Introduce skb_unset_transport_header() helper and apply it in the main
> decapsulation paths:
> 1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
> 2) vxlan_rcv() (covering VXLAN)
>
> This restores skb invariants at the decapsulation boundary without adding
> overhead to the Qdisc fast path.
>
> Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Assisted-by: Gemini:gemini-3.1-pro
I think a negative skb_transport_offset() should break something else too,
so the Fixes tag looks wrong, but I couldn't find any actual breakage
(luck, or I'm missing it).
I hope sashiko reads this reply and confirms it.
* Re: [PATCH net] net: clear transport header during tunnel decapsulation
2026-06-24 10:41 ` Jiayuan Chen
@ 2026-06-24 11:44 ` Eric Dumazet
0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2026-06-24 11:44 UTC
To: Jiayuan Chen
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, David Ahern, netdev, eric.dumazet,
syzbot+d5d0d598a4cfdfafdc3b
On Wed, Jun 24, 2026 at 3:41 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
>
> On 6/24/26 3:32 PM, Eric Dumazet wrote:
> > Syzbot triggered a DEBUG_NET_WARN_ON_ONCE(len > INT_MAX) assertion in
> > pskb_may_pull_reason() called from qdisc_pkt_len_segs_init().
> >
> > The root cause is a stale, negative transport header offset carried over
> > during tunnel decapsulation. When a tunnel receiver (e.g., VXLAN or Geneve)
> > decapsulates a packet, it pulls the outer headers but leaves the transport
> > header pointing to the outer UDP header. This offset becomes negative
> > relative to the new skb->data (inner IP header).
> >
> > If the packet bypasses GRO (e.g., an untrusted GSO packet flagged as
> > "unexpected GSO" by udp_unexpected_gso() due to missing tunnel GSO bits),
> > it is flushed directly to the stack as GRO_NORMAL. On ingress, Layer 2 Qdisc
> > processing (sch_handle_ingress) happens before Layer 3 IP reception
> > (ip_rcv_core) can run and reset the transport header. Consequently,
> > qdisc_pkt_len_segs_init() attempts to validate the transport header using
> > pskb_may_pull(skb, hdr_len + sizeof(tcphdr)). The negative hdr_len overflows
> > the unsigned cast in pskb_may_pull(), triggering the assertion.
> >
> > Fix this by clearing the transport header to the ~0U sentinel value during
> > decapsulation. This ensures that:
> > 1) The ingress Qdisc safely skips validation via !skb_transport_header_was_set()
> > and returns early without warning.
> > 2) The IP layer (ip_rcv_core) later correctly resets the transport header
> > to the inner L4 header offset.
> >
> > Introduce skb_unset_transport_header() helper and apply it in the main
> > decapsulation paths:
> > 1) __iptunnel_pull_header() (covering Geneve, GRE, IPIP, SIT, etc.)
> > 2) vxlan_rcv() (covering VXLAN)
> >
> > This restores skb invariants at the decapsulation boundary without adding
> > overhead to the Qdisc fast path.
> >
> > Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> > Reported-by: syzbot+d5d0d598a4cfdfafdc3b@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/netdev/6a3b853b.52ae72c2.136ac7.000c.GAE@google.com/T/#u
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Assisted-by: Gemini:gemini-3.1-pro
>
>
> I think a negative skb_transport_offset() should break something else too,
> so the Fixes tag looks wrong, but I couldn't find any actual breakage
> (luck, or I'm missing it).
Read the changelog again: on ingress, the transport header is set a bit
later in the stack.
Nothing needs it before that point, except qdisc_pkt_len_segs_init()
if/when it is called on ingress.
>
> Hope sashiko read this reply and confirm it....
On older kernels (before 7fb4c1967011 ("net: pull headers in
qdisc_pkt_len_segs_init()")), the bug is completely latent and harmless.
Pointing the Fixes tag at that commit prevents unnecessary backporting
churn and potential merge conflicts on very old kernels where
skb_unset_transport_header() doesn't exist.
The Historical Option (a6d5bbf34efa / d342894c5d28):
If we point to the original commits that introduced the tunnels,
we are historically accurate, but we risk stable scripts trying to
backport this fix all the way back to 2012/2016
(e.g. kernel 3.7 or 4.6), which is unnecessary and highly likely to
fail to apply.
* [syzbot ci] Re: net: clear transport header during tunnel decapsulation
2026-06-24 7:32 [PATCH net] net: clear transport header during tunnel decapsulation Eric Dumazet
2026-06-24 10:41 ` Jiayuan Chen
@ 2026-06-24 12:14 ` syzbot ci
1 sibling, 0 replies; 4+ messages in thread
From: syzbot ci @ 2026-06-24 12:14 UTC
To: davem, dsahern, edumazet, eric.dumazet, horms, idosch, kuba,
netdev, pabeni, syzbot
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] net: clear transport header during tunnel decapsulation
https://lore.kernel.org/all/20260624073209.3703492-1-edumazet@google.com
* [PATCH net] net: clear transport header during tunnel decapsulation
and found the following issue:
WARNING in geneve_udp_encap_recv
Full report is available here:
https://ci.syzbot.org/series/1f6dc47e-354f-4904-bc18-c2b7ea4d79b2
***
WARNING in geneve_udp_encap_recv
tree: net
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net.git
base: d87363b0edfc7504ff2b144fe4cdd8154f90f42e
arch: amd64
compiler: Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6
config: https://ci.syzbot.org/builds/7bb83c16-d55d-4d99-8c2b-6050e0022ef6/config
------------[ cut here ]------------
!skb_transport_header_was_set(skb)
WARNING: ./include/linux/skbuff.h:3094 at geneve_udp_encap_recv+0x26ed/0x4130, CPU#1: kworker/1:3/5072
Modules linked in:
CPU: 1 UID: 0 PID: 5072 Comm: kworker/1:3 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: mld mld_ifc_work
RIP: 0010:geneve_udp_encap_recv+0x26ed/0x4130
Code: c8 00 00 00 0f b6 04 01 84 c0 0f 85 f2 18 00 00 48 8b 7c 24 78 66 44 89 6f 0a e8 7e 9b 7a 03 e9 8c 00 00 00 e8 f4 fc 33 fb 90 <0f> 0b 90 e9 2e e3 ff ff 49 83 c6 06 4c 89 f0 48 c1 e8 03 48 b9 00
RSP: 0018:ffffc90000a08620 EFLAGS: 00010246
RAX: ffffffff8691f92c RBX: ffff8881bcc8b5d0 RCX: ffff88816a7abb80
RDX: 0000000000000100 RSI: 000000000000ffff RDI: 000000000000ffff
RBP: ffffc90000a08790 R08: ffffffff903114f7 R09: 1ffffffff206229e
R10: dffffc0000000000 R11: fffffbfff206229f R12: ffff888109f6a108
R13: 1ffff110213ed5df R14: 0000000000000010 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8882a927b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6956ea5440 CR3: 000000016f55c000 CR4: 00000000000006f0
Call Trace:
<IRQ>
udp_queue_rcv_one_skb+0xfc5/0x10e0
udp_unicast_rcv_skb+0x21a/0x3a0
udp_rcv+0xecb/0x1db0
ip_protocol_deliver_rcu+0x27e/0x440
ip_local_deliver_finish+0x3bb/0x6f0
NF_HOOK+0x336/0x3c0
NF_HOOK+0x336/0x3c0
process_backlog+0xa34/0x1860
__napi_poll+0xaa/0x330
net_rx_action+0x61d/0xf50
handle_softirqs+0x225/0x840
do_softirq+0x76/0xd0
</IRQ>
<TASK>
__local_bh_enable_ip+0xf8/0x130
__dev_queue_xmit+0x1ed7/0x37f0
ip6_output+0x337/0x540
NF_HOOK+0x177/0x4f0
mld_sendpack+0x890/0xe10
mld_ifc_work+0x839/0xe70
process_scheduled_works+0xa8e/0x14e0
worker_thread+0xa47/0xfb0
kthread+0x388/0x470
ret_from_fork+0x514/0xb70
ret_from_fork_asm+0x1a/0x30
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).
The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.
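
The warning in this report prints the condition !skb_transport_header_was_set(skb) at include/linux/skbuff.h:3094, which matches the DEBUG_NET_WARN_ON_ONCE() at the top of skb_transport_header() visible as context in the patch. That suggests some code inlined into geneve_udp_encap_recv() still reads the transport header after it has been unset; the trace alone does not identify which caller. The sketch below models only that generic accessor pattern. All dbg_* names are hypothetical, the warning is replaced by a counter, and, unlike the kernel accessor, the toy returns NULL for an unset header so that it never forms an out-of-range pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sk_buff fields the accessor uses. */
struct dbg_skb {
	unsigned char *head;        /* start of the buffer */
	uint16_t transport_header;  /* offset from head, or the ~0U sentinel */
};

/* Counts how often the accessor was called on an unset header. */
static unsigned int dbg_warn_count;

static bool dbg_transport_header_was_set(const struct dbg_skb *skb)
{
	return skb->transport_header != (uint16_t)~0U;
}

static void dbg_unset_transport_header(struct dbg_skb *skb)
{
	skb->transport_header = (uint16_t)~0U;
}

/* Accessor that flags any read of an unset transport header. */
static unsigned char *dbg_transport_header(const struct dbg_skb *skb)
{
	assert(skb != NULL);

	if (!dbg_transport_header_was_set(skb)) {
		dbg_warn_count++;  /* stands in for DEBUG_NET_WARN_ON_ONCE() */
		return NULL;       /* toy-only choice, see the note above */
	}
	return skb->head + skb->transport_header;
}
```

In this model, any caller that needs outer-header fields has to read them before dbg_unset_transport_header() runs; a read afterwards increments dbg_warn_count.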