From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: Paolo Abeni <pabeni@redhat.com>, Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org,
syzbot+83181a31faf9455499c5@syzkaller.appspotmail.com,
"David S. Miller" <davem@davemloft.net>,
David Ahern <dsahern@kernel.org>,
Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>,
Pravin B Shelar <pshelar@nicira.com>,
Tom Herbert <tom@herbertland.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
Date: Fri, 24 Apr 2026 10:04:22 +0800
Message-ID: <b4405f36-9492-444d-9419-c4bb907009e6@linux.dev>
In-Reply-To: <4f467b95-d135-4c1a-9f44-09138ff2d592@redhat.com>
On 4/23/26 4:19 PM, Paolo Abeni wrote:
> On 4/19/26 3:01 PM, Jiayuan Chen wrote:
>> [...]
>>>> +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
>>>> return 0;
>>>> }
>>>>
>>>> +static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
>>>> +{
>>>> + if (!skb_is_gso(skb))
>>>> + return;
>>>> +
>>>> + skb->transport_header = (typeof(skb->transport_header))~0U;
>>>> + skb_probe_transport_header(skb);
>>>> +
>>>> + if (!skb_transport_header_was_set(skb))
>>>> + skb_gso_reset(skb);
>>> I do not think this makes sense.
>>> What is a valid case for this packet being processed further?
>>> The buggy packet must be dropped, instead of being mangled like this.
>> Hi Eric,
>>
>> The reproducer builds a gre frame whose inner Ethernet header is
>> all-zero. Tracing the skb through RX:
>>
>> 1. At GRE decap exit, skb_transport_offset(skb) < 0 is the rule, not the
>> exception.
>>
>> It is negative for every packet leaving the tunnel, including perfectly
>> well-formed inner IPv4 traffic
>> because the tunnel leaves skb->transport_header at the outer L4 offset while
>> pskb_pull() has already advanced skb->data past it.
> Is it? the transport header is an offset on top of skb->head, pskb_pull
> changes head only if the header is not in the linear part (and the
> transport offset is already invalid).
Sorry, my wording was imprecise. The point is not that `transport_header`
itself holds a negative value — it does not — but that after GRE processing,
`skb->data` has advanced past the outer L4 while `skb->transport_header`
is never touched, so `skb_transport_offset(skb)` ends up negative.
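
To make the arithmetic concrete, here is a minimal user-space sketch (not kernel code). The `model_` names and the struct are hypothetical stand-ins, and the 20/4/14-byte header lengths are assumed for illustration. Only the two formulas mirror the real helpers in include/linux/skbuff.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the sk_buff fields involved. */
struct model_skb {
	unsigned char *head;       /* start of the buffer */
	unsigned char *data;       /* current start of packet data */
	unsigned char *tail;       /* end of packet data */
	uint16_t transport_header; /* offset from head, not a pointer */
};

/* Same arithmetic as skb_transport_offset(): (head + offset) - data. */
static int model_transport_offset(const struct model_skb *skb)
{
	return (int)((skb->head + skb->transport_header) - skb->data);
}

/* Same test as skb_transport_header_was_set(): compare with the ~0 sentinel. */
static int model_transport_header_was_set(const struct model_skb *skb)
{
	return skb->transport_header != (uint16_t)~0U;
}

/* Advance data within the linear area; head and transport_header stay put. */
static void model_pull(struct model_skb *skb, size_t len)
{
	assert((size_t)(skb->tail - skb->data) >= len);
	skb->data += len;
}

static unsigned char model_buf[128];

/* Assumed layout: 20-byte outer IPv4, 4-byte GRE, 14-byte inner Ethernet. */
static void model_gre_teb_decap(struct model_skb *skb)
{
	skb->head = model_buf;
	skb->data = model_buf + 20;  /* data sits at the outer L4 (GRE) */
	skb->tail = model_buf + sizeof(model_buf);
	skb->transport_header = 20;  /* recorded while parsing the outer IP header */
	model_pull(skb, 4);          /* GRE header */
	model_pull(skb, 14);         /* inner Ethernet header */
}
```

With these assumed lengths the stored offset is still 20, the "was set" check still passes, and the computed transport offset is 20 - 38 = -18.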
>> skb_transport_header_was_set() stays true, so downstream
>> code that trusts that flag now trusts a stale, negative offset.
>>
>> 2. GRO repairs it — but only for protocols it knows.
>>
>> In dev_gro_receive(), skb->protocol is dispatched through the offload
>> table. For ETH_P_IP,
>> inet_gro_receive() calls skb_set_transport_header(skb,
>> skb_gro_offset(skb)), and the offset
>> becomes valid again. But for malformed skb, dev_gro_receive just bypass it.
> So only malformed packets cause trouble, right?
>
The negative offset is produced for every packet leaving GRE, not just
malformed ones. What differs is what happens downstream:
- For well-formed inner IPv4, `inet_gro_receive()` calls
`skb_set_transport_header(skb, skb_gro_offset(skb))` and restores a
valid offset before any consumer observes it.
- For malformed inner frames (e.g. `skb->protocol == ETH_P_802_2` or
another protocol with no GRO receive callback), `dev_gro_receive()`
finds no ptype and just passes the skb through. The stale negative
offset survives into `__netif_receive_skb_core()`.
So the UAF needs both conditions: GRE producing the stale offset *and*
no downstream rescue.
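
The two outcomes can be sketched with a simplified user-space model. It is not the real `dev_gro_receive()`: the `model_`/`MODEL_` names are hypothetical, only IPv4 is modelled as having a handler, offsets are plain integers relative to `skb->head`, and the 20-byte inner IPv4 header length is assumed.

```c
#include <stdint.h>

#define MODEL_ETH_P_IP    0x0800 /* same value as the kernel's ETH_P_IP */
#define MODEL_ETH_P_802_2 0x0004 /* same value as the kernel's ETH_P_802_2 */

/* Hypothetical stand-in; offsets are relative to skb->head. */
struct model_skb {
	int data_off;         /* skb->data - skb->head */
	int transport_header; /* offset from head */
	uint16_t protocol;    /* host byte order, for simplicity */
};

static int model_transport_offset(const struct model_skb *skb)
{
	return skb->transport_header - skb->data_off;
}

/*
 * Returns 1 if a handler claimed the skb and repaired the offset, else 0.
 * inner_l3_len plays the role of skb_gro_offset() after the inner IPv4
 * header has been parsed.
 */
static int model_dev_gro_receive(struct model_skb *skb, int inner_l3_len)
{
	if (skb->protocol == MODEL_ETH_P_IP) {
		/* Stands in for skb_set_transport_header(skb, skb_gro_offset(skb)). */
		skb->transport_header = skb->data_off + inner_l3_len;
		return 1;
	}
	/* No offload handler for this protocol: the skb passes through untouched. */
	return 0;
}
```

Starting from the same stale state (data at 38, transport_header at 20), the IPv4 skb leaves with a transport offset of 20, while the ETH_P_802_2 skb leaves with -18.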
>> 3. Both kinds then reach __netif_receive_skb_core().
>>
>> So the skb that qdisc/tc/BPF segmenters later see has an
>> invariant violation — _was_set == true but offset < 0 — that the core
>> layer has no intention of catching for us.
>>
>> My reading of this is that the tunnel decap path is producing an skb
>> that doesn't
>> honor the contract __netif_receive_skb_core() expects from its
>> producers, and that
>> it doesn't really make sense to ask GRE to parse or validate the inner
>> L4 in order
>> to fix this.
>>
>> I'm thinking at the end of GRE decap, before handing the skb to
>> gro_cells_receive(),
>> call skb_reset_transport_header(skb).
> My take is that you need to address the issue earlier than the current
> patch, dropping the malformed packets.
>
> /P
Dropping at tunnel decap is a reasonable option, e.g.:
    if (unlikely(skb->protocol == htons(ETH_P_802_2) ||
                 skb->protocol == htons(ETH_P_802_3) ||
                 ....)) {
        kfree_skb_reason(skb, SKB_DROP_REASON_...);
        return 0;
    }
Two concerns about this approach, though:
1. It asks GRE to decide whether an inner L2 frame is "sensible",
which I don't think should be GRE's responsibility — GRE is a
generic L2/L3 tunnel and historically stays agnostic about the
inner payload.
2. More importantly, filtering on ETH_P_802_2 / ETH_P_802_3 only
covers the case where inner h_proto < ETH_P_802_3_MIN. The same
stale-offset condition can also be reached with any inner
ethertype that has no GRO receive callback resetting
transport_header.
In my earlier reply to Eric I suggested calling
skb_reset_transport_header(skb) at the tunnel decap exit instead.
A few reasons I think this is a cleaner fix:
1. It is inner-protocol agnostic: it normalizes the skb regardless
of what the inner ethertype happens to be, so ARP/PPPoE/... are
fixed by the same one-liner.
2. ip_tunnel_rcv() already updates mac_header (via eth_type_trans)
and network_header (ip_tunnel.c:414). transport_header is the
only one of the three left pointing at the outer offset; resetting
it here completes what the function is already doing for the
other two.
3. Malformed frames that continue downstream simply fail ptype_base
dispatch and are dropped there, the same way any unknown-ethertype
frame is dropped today.
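
A small user-space sketch of the intended effect follows. The `model_` names are hypothetical and the 18 bytes pulled (4-byte GRE plus 14-byte inner Ethernet) are assumed. The only part taken from the kernel is the assignment in the reset helper, which matches what `skb_reset_transport_header()` does (`transport_header = data - head`). Where exactly the call would sit in the decap path is left open here.

```c
/* Hypothetical stand-in; offsets are relative to skb->head. */
struct model_skb {
	int data_off;         /* skb->data - skb->head */
	int transport_header; /* offset from head */
};

static int model_transport_offset(const struct model_skb *skb)
{
	return skb->transport_header - skb->data_off;
}

/* Same effect as skb_reset_transport_header(): point it at the current data. */
static void model_reset_transport_header(struct model_skb *skb)
{
	skb->transport_header = skb->data_off;
}

/*
 * Decap exit with the proposed one-liner: pull the tunnel headers, then
 * normalize transport_header without looking at the inner protocol.
 */
static void model_tunnel_decap_exit(struct model_skb *skb, int pulled)
{
	skb->data_off += pulled;
	model_reset_transport_header(skb);
}
```

In this model the offset is 0 on exit for every inner ethertype, so a later consumer never sees a negative value even when no GRO callback runs. Protocols that do have a GRO callback still overwrite it with the real inner L4 offset.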
Thread overview: 5+ messages
2026-04-19 9:08 [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap Jiayuan Chen
2026-04-19 9:25 ` Eric Dumazet
2026-04-19 13:01 ` Jiayuan Chen
2026-04-23 8:19 ` Paolo Abeni
2026-04-24 2:04 ` Jiayuan Chen [this message]