public inbox for netdev@vger.kernel.org
From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: Paolo Abeni <pabeni@redhat.com>, Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org,
	syzbot+83181a31faf9455499c5@syzkaller.appspotmail.com,
	"David S. Miller" <davem@davemloft.net>,
	David Ahern <dsahern@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>,
	Pravin B Shelar <pshelar@nicira.com>,
	Tom Herbert <tom@herbertland.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
Date: Fri, 24 Apr 2026 10:04:22 +0800
Message-ID: <b4405f36-9492-444d-9419-c4bb907009e6@linux.dev>
In-Reply-To: <4f467b95-d135-4c1a-9f44-09138ff2d592@redhat.com>


On 4/23/26 4:19 PM, Paolo Abeni wrote:
> On 4/19/26 3:01 PM, Jiayuan Chen wrote:
>> [...]
>>>>    +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
>>>>           return 0;
>>>>    }
>>>>
>>>> +static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
>>>> +{
>>>> +       if (!skb_is_gso(skb))
>>>> +               return;
>>>> +
>>>> +       skb->transport_header = (typeof(skb->transport_header))~0U;
>>>> +       skb_probe_transport_header(skb);
>>>> +
>>>> +       if (!skb_transport_header_was_set(skb))
>>>> +               skb_gso_reset(skb);
>>> I do not think this makes sense.
>>> What is a valid case for this packet being processed further?
>>> The buggy packet must be dropped, instead of being mangled like this.
>> Hi Eric,
>>
>> The reproducer builds a gre frame whose inner Ethernet header is
>> all-zero. Tracing the skb through RX:
>>
>> 1. At GRE decap exit, skb_transport_offset(skb) < 0 is the rule, not the
>> exception.
>>
>> It is negative for every packet leaving the tunnel, including perfectly
>> well-formed inner IPv4 traffic
>> because the tunnel leaves skb->transport_header at the outer L4 offset while
>> pskb_pull() has already advanced skb->data past it.
> Is it? the transport header is an offset on top of skb->head, pskb_pull
> changes head only if the header is not in the linear part (and the
> transport offset is already invalid).


Sorry, my wording was imprecise. The point is not that `transport_header`
itself holds a negative value — it does not — but that after GRE processing,
`skb->data` has advanced past the outer L4 while `skb->transport_header`
is never touched, so `skb_transport_offset(skb)` ends up negative.
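
To make the arithmetic concrete, here is a rough userspace sketch of the
offsets involved. It is not kernel code: `struct model_skb` and the
`model_*` helpers are hypothetical stand-ins that only mimic what
`skb_reset_transport_header()`, the linear case of `pskb_pull()` and
`skb_transport_offset()` do, and the header sizes (14-byte Ethernet,
20-byte IPv4, 4-byte base GRE) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the sk_buff fields discussed here. */
struct model_skb {
	unsigned char *head;        /* start of the buffer */
	unsigned char *data;        /* start of the not-yet-consumed bytes */
	unsigned int len;           /* bytes available from data */
	uint16_t transport_header;  /* offset from head, as in the kernel */
};

/* Mimics skb_reset_transport_header(): record data's offset from head. */
static void model_reset_transport_header(struct model_skb *skb)
{
	skb->transport_header = (uint16_t)(skb->data - skb->head);
}

/* Mimics the linear case of pskb_pull(): data advances, head does not. */
static void model_pull(struct model_skb *skb, unsigned int n)
{
	assert(n <= skb->len);
	skb->data += n;
	skb->len -= n;
}

/* Mimics skb_transport_offset(): transport header relative to data. */
static int model_transport_offset(const struct model_skb *skb)
{
	return (int)((skb->head + skb->transport_header) - skb->data);
}

/* Walk one GRE/TEB frame through decap and return the resulting offset. */
static int model_gre_teb_decap(void)
{
	static unsigned char buf[128];
	struct model_skb skb = { buf, buf, sizeof(buf), 0 };

	model_pull(&skb, 14 + 20);          /* outer Ethernet + outer IPv4 */
	model_reset_transport_header(&skb); /* outer L4 (GRE) starts here */
	model_pull(&skb, 4);                /* base GRE header */
	model_pull(&skb, 14);               /* inner Ethernet header (TEB) */
	return model_transport_offset(&skb);
}
```

With these assumed sizes, `model_gre_teb_decap()` returns -18: the stored
offset (34 from head) never changes, while data has moved on to 52.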

>> skb_transport_header_was_set() stays true, so downstream
>> code that trusts that flag now trusts a stale, negative offset.
>>
>> 2. GRO repairs it — but only for protocols it knows.
>>
>> In dev_gro_receive(), skb->protocol is dispatched through the offload
>> table. For ETH_P_IP,
>> inet_gro_receive() calls skb_set_transport_header(skb,
>> skb_gro_offset(skb)), and the offset
>> becomes valid again. But for malformed skb, dev_gro_receive just bypass it.
> So only malformed packets cause trouble, right?
>
The negative offset is produced for every packet leaving GRE, not just
malformed ones. What differs is what happens downstream:

- For well-formed inner IPv4, `inet_gro_receive()` calls
  `skb_set_transport_header(skb, skb_gro_offset(skb))` and restores a
  valid offset before any consumer observes it.
- For malformed inner frames (e.g. `skb->protocol == ETH_P_802_2` or
  another protocol with no offload handler), `dev_gro_receive()` finds
  no ptype and just passes the skb through. The stale negative offset
  survives into `__netif_receive_skb_core()`.

So the UAF needs both conditions: GRE producing the stale offset *and*
no downstream rescue.
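
The two downstream paths can be sketched in userspace roughly as follows.
This is only a model: `model_dev_gro_receive()` is a hypothetical stand-in
for the offload dispatch, the starting offsets (data at 52, stale
transport header at 34) and the 20-byte inner IPv4 header are assumptions,
and protocols are kept in host byte order for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ETH_P_IP    0x0800  /* same value as ETH_P_IP */
#define MODEL_ETH_P_802_2 0x0004  /* same value as ETH_P_802_2 */

/* Hypothetical stand-in; all offsets are counted from the buffer head. */
struct model_skb {
	int data_off;          /* data - head */
	int transport_header;  /* stored transport header offset */
	uint16_t protocol;     /* host byte order in this model */
};

static int model_transport_offset(const struct model_skb *skb)
{
	return skb->transport_header - skb->data_off;
}

/* Returns 1 if an offload handler ran, 0 if the skb passed through. */
static int model_dev_gro_receive(struct model_skb *skb)
{
	switch (skb->protocol) {
	case MODEL_ETH_P_IP:
		/*
		 * Models inet_gro_receive() calling
		 * skb_set_transport_header(skb, skb_gro_offset(skb)),
		 * assuming a 20-byte inner IPv4 header.
		 */
		skb->transport_header = skb->data_off + 20;
		assert(model_transport_offset(skb) >= 0);
		return 1;
	default:
		/* No handler registered: nothing repairs the offset. */
		return 0;
	}
}

/* Offset a later consumer would observe for a given inner protocol. */
static int model_offset_after_gro(uint16_t protocol)
{
	struct model_skb skb = { 52, 34, protocol };

	model_dev_gro_receive(&skb);
	return model_transport_offset(&skb);
}
```

In this model the IPv4 case ends up with a valid offset of 20, while the
ETH_P_802_2 case still carries the stale -18.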


>> 3. Both kinds then reach __netif_receive_skb_core().
>>
>> So the skb that qdisc/tc/BPF segmenters later see has an
>> invariant violation — _was_set == true but offset < 0 — that the core
>> layer has no intention of catching for us.
>>
>> My reading of this is that the tunnel decap path is producing an skb
>> that doesn't
>> honor the contract __netif_receive_skb_core() expects from its
>> producers, and that
>> it doesn't really make sense to ask GRE to parse or validate the inner
>> L4 in order
>> to fix this.
>>
>> I'm thinking at the end of GRE decap, before handing the skb to
>> gro_cells_receive(),
>> call skb_reset_transport_header(skb).
> My take is that you need to address the issue earlier than the current
> patch, dropping the malformed packets.
>
> /P


Dropping at tunnel decap is a reasonable option, e.g.:

   if (unlikely(skb->protocol == htons(ETH_P_802_2) ||
                skb->protocol == htons(ETH_P_802_3) ||
                ....)) {
       kfree_skb_reason(skb, SKB_DROP_REASON_...);
       return 0;
   }

Two concerns about this approach, though:

1. It asks GRE to decide whether an inner L2 frame is "sensible",
  which I don't think should be GRE's responsibility: GRE is a
  generic L2/L3 tunnel and historically stays agnostic about the
  inner payload.

2. More importantly, filtering on ETH_P_802_2 / ETH_P_802_3 only
  covers the case where inner h_proto < ETH_P_802_3_MIN. The same
  stale-offset condition can also be reached with any inner
  ethertype that has no GRO receive callback resetting
  transport_header.
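
As a rough illustration of why such a filter is too narrow, the sketch
below models how `eth_type_trans()` classifies the inner frame. It is a
simplified userspace approximation (the DSA branch is left out, protocols
are in host byte order), and `model_eth_type()`, `model_classify()` and
`model_drop_filter()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_P_802_3     0x0001  /* same value as ETH_P_802_3 */
#define MODEL_ETH_P_802_2     0x0004  /* same value as ETH_P_802_2 */
#define MODEL_ETH_P_802_3_MIN 0x0600  /* same value as ETH_P_802_3_MIN */
#define MODEL_ETH_HLEN        14

/* Simplified model of the protocol eth_type_trans() would pick. */
static uint16_t model_eth_type(const unsigned char *frame, size_t len)
{
	uint16_t h_proto;

	assert(len >= MODEL_ETH_HLEN);
	h_proto = (uint16_t)((frame[12] << 8) | frame[13]);

	if (h_proto >= MODEL_ETH_P_802_3_MIN)
		return h_proto;               /* a real ethertype */
	if (len >= MODEL_ETH_HLEN + 2 &&
	    frame[14] == 0xFF && frame[15] == 0xFF)
		return MODEL_ETH_P_802_3;     /* raw 802.3 */
	return MODEL_ETH_P_802_2;             /* everything else */
}

/* Build a minimal frame from h_proto and the first payload word. */
static uint16_t model_classify(uint16_t h_proto, uint16_t sap)
{
	unsigned char frame[MODEL_ETH_HLEN + 2] = { 0 };

	frame[12] = (unsigned char)(h_proto >> 8);
	frame[13] = (unsigned char)(h_proto & 0xFF);
	frame[14] = (unsigned char)(sap >> 8);
	frame[15] = (unsigned char)(sap & 0xFF);
	return model_eth_type(frame, sizeof(frame));
}

/* The drop filter sketched above: only the two 802.x pseudo-protocols. */
static int model_drop_filter(uint16_t protocol)
{
	return protocol == MODEL_ETH_P_802_2 ||
	       protocol == MODEL_ETH_P_802_3;
}
```

In this model the reproducer's all-zero inner header classifies as
ETH_P_802_2 and is caught by the filter, but an inner ARP frame (0x0806)
is not, even though it takes the same no-GRO-handler path.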

In my earlier reply to Eric I suggested calling
skb_reset_transport_header(skb) at the tunnel decap exit instead.
A few reasons I think this is a cleaner fix:

1. It is inner-protocol agnostic: it normalizes the skb regardless
of what the inner ethertype happens to be, so ARP/PPPoE/... are
fixed by the same one-liner.

2. ip_tunnel_rcv() already updates mac_header (via eth_type_trans)
and network_header (ip_tunnel.c:414). transport_header is the
only one of the three left pointing at the outer offset; resetting
it here is completing what the function is already doing for the
other two.

3. Malformed frames that continue downstream simply fail ptype_base
dispatch and are dropped there, the same way any unknown-ethertype
frame is dropped today.
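
A rough userspace sketch of the decap exit with and without the reset,
for comparison. This is a model only: `model_decap_exit()` is a
hypothetical stand-in for the tail of the receive path, the `fix` flag
represents the proposed `skb_reset_transport_header()` call, and the
starting offsets (data at 38 after the GRE header pull, stale transport
header at 34) are assumptions.

```c
#include <assert.h>

/* Hypothetical stand-in; all offsets are counted from the buffer head. */
struct model_skb {
	int data_off;          /* data - head */
	int mac_header;
	int network_header;
	int transport_header;
};

/* Models the header bookkeeping at the end of tunnel decap. */
static void model_decap_exit(struct model_skb *skb, int fix)
{
	/* As eth_type_trans() does: mac header at data, then pull 14 bytes. */
	skb->mac_header = skb->data_off;
	skb->data_off += 14;

	/* As skb_reset_network_header() does. */
	skb->network_header = skb->data_off;

	if (fix) {
		/* Proposed: skb_reset_transport_header() here as well. */
		skb->transport_header = skb->data_off;
		assert(skb->transport_header >= skb->network_header);
	}
}

/* Transport offset relative to data once decap has finished. */
static int model_transport_offset_at_exit(int fix)
{
	/* Outer network header at 14, outer L4 (GRE) at 34, data at 38. */
	struct model_skb skb = { 38, 0, 14, 34 };

	model_decap_exit(&skb, fix);
	return skb.transport_header - skb.data_off;
}
```

Without the reset the model leaves the offset at -18; with it the offset
is 0, whatever the inner ethertype turns out to be.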


Thread overview: 5+ messages
2026-04-19  9:08 [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap Jiayuan Chen
2026-04-19  9:25 ` Eric Dumazet
2026-04-19 13:01   ` Jiayuan Chen
2026-04-23  8:19     ` Paolo Abeni
2026-04-24  2:04       ` Jiayuan Chen [this message]
