* [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
@ 2026-04-19 9:08 Jiayuan Chen
2026-04-19 9:25 ` Eric Dumazet
0 siblings, 1 reply; 3+ messages in thread
From: Jiayuan Chen @ 2026-04-19 9:08 UTC (permalink / raw)
To: netdev
Cc: Jiayuan Chen, syzbot+83181a31faf9455499c5, David S. Miller,
David Ahern, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Pravin B Shelar, Tom Herbert, linux-kernel
syzbot reported a BUG.
I found that after GRE decapsulation in gretap/ip6gretap paths, the
transport_header becomes stale with a negative offset. The sequence is:
1. Before decap, transport_header points to the outer L4 (GRE) header.
2. __iptunnel_pull_header() calls skb_pull_rcsum() to advance skb->data
past the GRE header, but does not update transport_header.
3. For TEB (gretap/ip6gretap), eth_type_trans() in ip_tunnel_rcv() /
__ip6_tnl_rcv() further pulls ETH_HLEN (14 bytes) from skb->data.
After these two pulls, skb->data has moved forward while transport_header
still points to the old (now behind skb->data) position, resulting in a
negative skb_transport_offset(): typically -4 after GRE pull alone, or
-18 after GRE + inner Ethernet pull.
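The -4 and -18 figures can be checked with a toy userspace model. This is only a sketch: `struct toy_skb`, the `toy_*` helpers, and the `TOY_*` constants are hypothetical names, not kernel code. It assumes a base 4-byte GRE header with no optional fields and tracks `data` and `transport_header` as offsets from the buffer head.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the two sk_buff fields involved (NOT the kernel struct). */
struct toy_skb {
	size_t data;             /* offset of skb->data from skb->head */
	size_t transport_header; /* offset of the transport header from skb->head */
};

#define TOY_GRE_HLEN 4  /* assumption: base GRE header, no key/seq/csum */
#define TOY_ETH_HLEN 14 /* same value as the kernel's ETH_HLEN */

/* Models skb_transport_offset(): transport header relative to data. */
static int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)((long)skb->transport_header - (long)skb->data);
}

/* Models a pull: data advances, transport_header is left untouched. */
static void toy_pull(struct toy_skb *skb, size_t len)
{
	skb->data += len;
}

/*
 * Offset seen after decap. outer_l4 is where the GRE header starts;
 * teb selects the extra inner-Ethernet pull done for gretap/ip6gretap.
 */
static int toy_offset_after_decap(size_t outer_l4, int teb)
{
	struct toy_skb skb = { .data = outer_l4, .transport_header = outer_l4 };

	toy_pull(&skb, TOY_GRE_HLEN);
	if (teb)
		toy_pull(&skb, TOY_ETH_HLEN);
	return toy_transport_offset(&skb);
}
```

With any starting offset (for example 34), `toy_offset_after_decap(34, 0)` yields -4 and `toy_offset_after_decap(34, 1)` yields -18, because only `data` moves.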
In the normal case where the inner frame is a recognizable protocol
(e.g., IPv4/TCP), the transport_header is subsequently overwritten by
ip_rcv_core() (or inet_gro_receive() on the GRO path) via
skb_set_transport_header(), and the stale value never reaches downstream
consumers. However, if the inner frame cannot be parsed (e.g.,
eth_type_trans() classifies it as ETH_P_802_2 due to a zero/invalid
inner Ethernet header), neither rescue runs, and the stale offset
persists into __netif_receive_skb_core().
When this stale offset is combined with contradictory GSO metadata (e.g.,
SKB_GSO_TCPV4 injected via virtio_net_hdr from a tun device),
qdisc_pkt_len_segs_init() trusts the negative offset: the unsigned
wraparound makes pskb_may_pull() effectively a no-op, and __tcp_hdrlen()
then reads from an invalid memory location, causing a use-after-free.
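The wraparound can be illustrated in isolation. The sketch below is a userspace model under stated assumptions, not the kernel's qdisc_pkt_len_segs_init(): it assumes the length check stores the transport offset in an unsigned int and adds a 20-byte minimal TCP header before the bounds check. `toy_required_len()`, `toy_may_pull()`, and `TOY_TCP_HLEN` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_TCP_HLEN 20u /* assumption: minimal TCP header, no options */

/*
 * Bytes the caller believes must be linear before reading the TCP header.
 * A negative offset converts to a huge unsigned value, and adding the
 * header size wraps it back to a small number.
 */
static unsigned int toy_required_len(int transport_offset)
{
	unsigned int hdr_len = (unsigned int)transport_offset; /* wraps if < 0 */

	return hdr_len + TOY_TCP_HLEN;
}

/* Simplified bounds check: succeed when len bytes are available. */
static bool toy_may_pull(unsigned int len, unsigned int pkt_len)
{
	return len <= pkt_len;
}
```

For an offset of -18 the required length comes out as 2, so the check passes on almost any packet even though the header pointer computed from the same offset lies 18 bytes before skb->data.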
The UAF only triggers on the GSO path, where qdisc_pkt_len_segs_init()
dereferences the transport header to compute per-segment length. Fix
this by introducing iptunnel_rebuild_transport_header(), which is a
no-op for non-GSO packets and otherwise re-probes the transport header
via the flow dissector. If re-probing fails, the contradictory GSO
metadata is cleared via skb_gso_reset() so downstream consumers cannot
trust stale offsets. Restricting the rebuild to GSO packets keeps the
flow-dissector cost off the common rx fast path.
reproducer: https://gist.github.com/mrpre/5ba943fd86367af748b70de99263da4b
Link: https://syzkaller.appspot.com/bug?extid=83181a31faf9455499c5
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Fixes: 0d3c703a9d17 ("ipv6: Cleanup IPv6 tunnel receive path")
Reported-by: syzbot+83181a31faf9455499c5@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69de2bee.a00a0220.475f0.0041.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
As a follow-up for production reliability, I am wondering whether we
can extend the existing safety net in __netif_receive_skb_core() to
also handle set-but-negative transport_header:

	if (!skb_transport_header_was_set(skb) ||
	    skb_transport_offset(skb) < 0)
		skb_reset_transport_header(skb);
---
include/net/ip_tunnels.h | 12 ++++++++++++
net/ipv4/ip_tunnel.c | 2 ++
net/ipv6/ip6_tunnel.c | 2 ++
3 files changed, 16 insertions(+)
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index d708b66e55cd..9b4e662833a1 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -662,6 +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
 	return 0;
 }
 
+static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
+{
+	if (!skb_is_gso(skb))
+		return;
+
+	skb->transport_header = (typeof(skb->transport_header))~0U;
+	skb_probe_transport_header(skb);
+
+	if (!skb_transport_header_was_set(skb))
+		skb_gso_reset(skb);
+}
+
 static inline void iptunnel_xmit_stats(struct net_device *dev, int pkt_len)
 {
 	if (pkt_len > 0) {
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 50d0f5fe4e4c..c46be68cfafa 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -445,6 +445,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
 	if (tun_dst)
 		skb_dst_set(skb, (struct dst_entry *)tun_dst);
 
+	iptunnel_rebuild_transport_header(skb);
+
 	gro_cells_receive(&tunnel->gro_cells, skb);
 	return 0;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 46bc06506470..f95348cf3c77 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -879,6 +879,8 @@ static int __ip6_tnl_rcv(struct ip6_tnl *tunnel, struct sk_buff *skb,
 	if (tun_dst)
 		skb_dst_set(skb, (struct dst_entry *)tun_dst);
 
+	iptunnel_rebuild_transport_header(skb);
+
 	gro_cells_receive(&tunnel->gro_cells, skb);
 	return 0;
--
2.43.0
* Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
2026-04-19 9:08 [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap Jiayuan Chen
@ 2026-04-19 9:25 ` Eric Dumazet
2026-04-19 13:01 ` Jiayuan Chen
0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2026-04-19 9:25 UTC (permalink / raw)
To: Jiayuan Chen
Cc: netdev, syzbot+83181a31faf9455499c5, David S. Miller, David Ahern,
Jakub Kicinski, Paolo Abeni, Simon Horman, Pravin B Shelar,
Tom Herbert, linux-kernel
On Sun, Apr 19, 2026 at 2:08 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> syzbot reported a BUG.
>
> I found that after GRE decapsulation in gretap/ip6gretap paths, the
> transport_header becomes stale with a negative offset. The sequence is:
>
> 1. Before decap, transport_header points to the outer L4 (GRE) header.
> 2. __iptunnel_pull_header() calls skb_pull_rcsum() to advance skb->data
> past the GRE header, but does not update transport_header.
> 3. For TEB (gretap/ip6gretap), eth_type_trans() in ip_tunnel_rcv() /
> __ip6_tnl_rcv() further pulls ETH_HLEN (14 bytes) from skb->data.
>
> After these two pulls, skb->data has moved forward while transport_header
> still points to the old (now behind skb->data) position, resulting in a
> negative skb_transport_offset(): typically -4 after GRE pull alone, or
> -18 after GRE + inner Ethernet pull.
>
> In the normal case where the inner frame is a recognizable protocol
> (e.g., IPv4/TCP), the transport_header is subsequently overwritten by
> ip_rcv_core() (or inet_gro_receive() on the GRO path) via
> skb_set_transport_header(), and the stale value never reaches downstream
> consumers. However, if the inner frame cannot be parsed (e.g.,
> eth_type_trans() classifies it as ETH_P_802_2 due to a zero/invalid
> inner Ethernet header), neither rescue runs, and the stale offset
> persists into __netif_receive_skb_core().
>
> When this stale offset is combined with contradictory GSO metadata (e.g.,
> SKB_GSO_TCPV4 injected via virtio_net_hdr from a tun device),
> qdisc_pkt_len_segs_init() trusts the negative offset: the unsigned
> wraparound makes pskb_may_pull() effectively a no-op, and __tcp_hdrlen()
> then reads from an invalid memory location, causing a use-after-free.
>
> The UAF only triggers on the GSO path, where qdisc_pkt_len_segs_init()
> dereferences the transport header to compute per-segment length. Fix
> this by introducing iptunnel_rebuild_transport_header(), which is a
> no-op for non-GSO packets and otherwise re-probes the transport header
> via the flow dissector. If re-probing fails, the contradictory GSO
> metadata is cleared via skb_gso_reset() so downstream consumers cannot
> trust stale offsets. Restricting the rebuild to GSO packets keeps the
> flow-dissector cost off the common rx fast path.
>
> reproducer: https://gist.github.com/mrpre/5ba943fd86367af748b70de99263da4b
>
> Link: https://syzkaller.appspot.com/bug?extid=83181a31faf9455499c5
> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> Fixes: 0d3c703a9d17 ("ipv6: Cleanup IPv6 tunnel receive path")
> Reported-by: syzbot+83181a31faf9455499c5@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69de2bee.a00a0220.475f0.0041.GAE@google.com/T/
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>
> As a follow-up for production reliability, I am wondering whether we
> can extend the existing safety net in __netif_receive_skb_core() to
> also handle set-but-negative transport_header:
>
> if (!skb_transport_header_was_set(skb) ||
> skb_transport_offset(skb) < 0)
> skb_reset_transport_header(skb);
> ---
> include/net/ip_tunnels.h | 12 ++++++++++++
> net/ipv4/ip_tunnel.c | 2 ++
> net/ipv6/ip6_tunnel.c | 2 ++
> 3 files changed, 16 insertions(+)
>
> diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
> index d708b66e55cd..9b4e662833a1 100644
> --- a/include/net/ip_tunnels.h
> +++ b/include/net/ip_tunnels.h
> @@ -662,6 +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
>  	return 0;
>  }
>
> +static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
> +{
> +	if (!skb_is_gso(skb))
> +		return;
> +
> +	skb->transport_header = (typeof(skb->transport_header))~0U;
> +	skb_probe_transport_header(skb);
> +
> +	if (!skb_transport_header_was_set(skb))
> +		skb_gso_reset(skb);
I do not think this makes sense.
What is a valid case for this packet being processed further?
The buggy packet must be dropped, instead of being mangled like this.
* Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
2026-04-19 9:25 ` Eric Dumazet
@ 2026-04-19 13:01 ` Jiayuan Chen
0 siblings, 0 replies; 3+ messages in thread
From: Jiayuan Chen @ 2026-04-19 13:01 UTC (permalink / raw)
To: Eric Dumazet, Jiayuan Chen
Cc: netdev, syzbot+83181a31faf9455499c5, David S. Miller, David Ahern,
Jakub Kicinski, Paolo Abeni, Simon Horman, Pravin B Shelar,
Tom Herbert, linux-kernel
[...]
>> +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
>>  	return 0;
>>  }
>>
>> +static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
>> +{
>> +	if (!skb_is_gso(skb))
>> +		return;
>> +
>> +	skb->transport_header = (typeof(skb->transport_header))~0U;
>> +	skb_probe_transport_header(skb);
>> +
>> +	if (!skb_transport_header_was_set(skb))
>> +		skb_gso_reset(skb);
> I do not think this makes sense.
> What is a valid case for this packet being processed further?
> The buggy packet must be dropped, instead of being mangled like this.
Hi Eric,
The reproducer builds a GRE frame whose inner Ethernet header is
all-zero. Tracing the skb through RX:

1. At GRE decap exit, skb_transport_offset(skb) < 0 is the rule, not
   the exception. It is negative for every packet leaving the tunnel,
   including perfectly well-formed inner IPv4 traffic, because the
   tunnel leaves skb->transport_header at the outer L4 offset while
   pskb_pull() has already advanced skb->data past it.
   skb_transport_header_was_set() stays true, so downstream code that
   trusts that flag now trusts a stale, negative offset.

2. GRO repairs it, but only for protocols it knows. In
   dev_gro_receive(), skb->protocol is dispatched through the offload
   table. For ETH_P_IP, inet_gro_receive() calls
   skb_set_transport_header(skb, skb_gro_offset(skb)), and the offset
   becomes valid again. For a malformed skb, however,
   dev_gro_receive() simply bypasses the repair.

3. Both kinds then reach __netif_receive_skb_core(). So the skb that
   qdisc/tc/BPF segmenters later see has an invariant violation
   (skb_transport_header_was_set() is true but the offset is
   negative) that the core layer has no intention of catching for us.

My reading of this is that the tunnel decap path is producing an skb
that doesn't honor the contract __netif_receive_skb_core() expects
from its producers, and that it doesn't really make sense to ask GRE
to parse or validate the inner L4 in order to fix this.

I'm thinking of calling skb_reset_transport_header(skb) at the end of
GRE decap, before handing the skb to gro_cells_receive().
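The effect of that reset can be sketched with a toy userspace model. This is not kernel code: `struct toy_skb` and the `toy_*` helpers are hypothetical names, and the model relies only on skb_reset_transport_header() pointing the transport header at the current skb->data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the two sk_buff fields involved (NOT the kernel struct). */
struct toy_skb {
	size_t data;             /* offset of skb->data from skb->head */
	size_t transport_header; /* offset of the transport header from skb->head */
};

/* Models skb_transport_offset(): transport header relative to data. */
static int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)((long)skb->transport_header - (long)skb->data);
}

/* Models skb_reset_transport_header(): point the header at current data. */
static void toy_reset_transport_header(struct toy_skb *skb)
{
	skb->transport_header = skb->data;
}

/*
 * Pull `pulled` bytes past a transport header that started at outer_l4,
 * then reset. The stale negative offset becomes 0 regardless of how
 * much was pulled.
 */
static int toy_offset_after_reset(size_t outer_l4, size_t pulled)
{
	struct toy_skb skb = { .data = outer_l4, .transport_header = outer_l4 };

	skb.data += pulled; /* GRE pull, plus ETH_HLEN for TEB */
	toy_reset_transport_header(&skb);
	return toy_transport_offset(&skb);
}
```

In this model the offset is 0 after the reset for both the 4-byte and the 18-byte pull, so a consumer that trusts the "was set" flag no longer sees a negative value. Whether offset 0 is the right value for the inner packet is a separate question the model does not answer.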
Thanks,
Jiayuan