Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
@ 2026-06-30  2:35 Shiming Cheng
  2026-07-02 10:02 ` Paolo Abeni
  0 siblings, 1 reply; 3+ messages in thread
From: Shiming Cheng @ 2026-06-30  2:35 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, horms, matthias.bgg,
	angelogioacchino.delregno, willemb, daniel.zahka, alice, sd,
	eilaimemedsnaimel, imv4bel, nbd, dsahern, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek
  Cc: stable, lena.wang, shiming.cheng

The new skb_gro_receive_list() function is missing a critical safety check
present in the legacy skb_gro_receive() path. Specifically, it does not
validate NAPI_GRO_CB(skb)->flush before allowing packet aggregation.

This allows already-GRO'd packets with existing frag_list to be
re-aggregated into a new GRO session, corrupting the frag_list chain
structure. When skb_segment() attempts to unpack these malformed packets,
it encounters invalid state and triggers a kernel panic.

Scenario (Tethering/Device forwarding):
  1. Driver: Generated aggregated packet P1 via LRO with frag_list
  2. Dev A: Receives aggregated fraglist packet and flush flag set
  3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
  4. Missing flush check allows re-aggregation despite flush flag
  5. Frag_list chain becomes corrupted (loops or dangling refs)
  6. Dev B: TX path calls skb_segment(), crashes on corrupted frag_list

Root cause in skb_segment():
  The check at line ~4891:
    if (hsize <= 0 && i >= nfrags && skb_headlen(list_skb) &&
        (skb_headlen(list_skb) == len || sg)) {

  When frag_list is corrupted by double aggregation, list_skb can be a
  NULL pointer taken from skb->next, so skb_headlen(list_skb)
  dereferences a NULL or corrupted pointer.

Call Trace:
 skb_headlen(NULL skb)
 skb_segment
 tcp_gso_segment
 tcp4_gso_segment
 inet_gso_segment
 skb_mac_gso_segment
 __skb_gso_segment
 skb_gso_segment
 validate_xmit_skb
 validate_xmit_skb_list
 sch_direct_xmit
 qdisc_restart
 __qdisc_run
 qdisc_run
 net_tx_action

Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return check in
skb_gro_receive_list(), matching the defensive programming pattern of
skb_gro_receive().

Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")
Cc: stable@vger.kernel.org
Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com>
---
 net/core/gro.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/gro.c b/net/core/gro.c
index 35f2f708f010..076247c1e662 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -229,7 +229,8 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 
 int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
 {
-	if (unlikely(p->len + skb->len >= 65536))
+	if (unlikely(p->len + skb->len >= 65536 ||
+		     NAPI_GRO_CB(skb)->flush))
 		return -E2BIG;
 
 	if (!pskb_may_pull(skb, skb_gro_offset(skb))) {
-- 
2.45.2

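For readers following the diff above, the patched early-return condition can be modeled in user space. This is only a minimal sketch: `struct toy_gro_skb`, `TOY_GRO_MAX_SIZE` and `toy_gro_receive_list_check()` are hypothetical names invented for illustration, the `flush` field merely stands in for `NAPI_GRO_CB(skb)->flush`, and nothing after the early return in skb_gro_receive_list() is modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff plus its GRO control block; not a kernel type. */
struct toy_gro_skb {
	unsigned int len;	/* total packet length in bytes */
	bool flush;		/* models NAPI_GRO_CB(skb)->flush */
};

/* Same size limit as the literal 65536 in the diff. */
#define TOY_GRO_MAX_SIZE 65536u

/*
 * Models only the early-return check after the patch: refuse to merge
 * when the combined length is too large or when the incoming skb is
 * flush-marked. Returns 0 if aggregation may proceed, -E2BIG otherwise.
 */
static int toy_gro_receive_list_check(const struct toy_gro_skb *p,
				      const struct toy_gro_skb *skb)
{
	assert(p != NULL && skb != NULL);

	if (p->len + skb->len >= TOY_GRO_MAX_SIZE || skb->flush)
		return -E2BIG;

	return 0;
}
```

With this model, a 1500-byte skb whose flush flag is set is rejected even though the combined length is far below the limit, which is the behavior the patch adds.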



* Re: [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
  2026-06-30  2:35 [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs Shiming Cheng
@ 2026-07-02 10:02 ` Paolo Abeni
  2026-07-03  1:26   ` Shiming Cheng (成诗明)
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2026-07-02 10:02 UTC (permalink / raw)
  To: Shiming Cheng, davem, edumazet, kuba, horms, matthias.bgg,
	angelogioacchino.delregno, willemb, daniel.zahka, alice, sd,
	eilaimemedsnaimel, imv4bel, nbd, dsahern, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek
  Cc: stable, lena.wang

Note: the patch subject is quite incorrect

On 6/30/26 4:35 AM, Shiming Cheng wrote:
> The new skb_gro_receive_list() function is missing a critical safety check
> present in the legacy skb_gro_receive() path. Specifically, it does not
> validate NAPI_GRO_CB(skb)->flush before allowing packet aggregation.

skb_gro_receive_list() is not very "new" and definitely
skb_gro_receive() is not legacy.

> This allows already-GRO'd packets with existing frag_list to be
> re-aggregated into a new GRO session, corrupting the frag_list chain
> structure. When skb_segment() attempts to unpack these malformed packets,
> it encounters invalid state and triggers a kernel panic.
> 
> Scenario (Tethering/Device forwarding):
>   1. Driver: Generated aggregated packet P1 via LRO with frag_list
>   2. Dev A: Receives aggregated fraglist packet and flush flag set
>   3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
>   4. Missing flush check allows re-aggregation despite flush flag
>   5. Frag_list chain becomes corrupted (loops or dangling refs)
>   6. Dev B: TX path calls skb_segment(), crashes on corrupted frag_list

I can't parse the above. Is this something that can happen with in-tree
drivers or do you need OoT module to trigger it? In any case please
clarify the actual order and the involved driver. Possibly a stack
trace leading to the critical aggregation could help.

> Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return check in
> skb_gro_receive_list(), matching the defensive programming pattern of
> skb_gro_receive().
> 
> Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")

The fix tag is wrong, should be:

Fixes: 3a1296a38d0c ('net: Support GRO/GSO fraglist chaining.')

/P




* Re: [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
  2026-07-02 10:02 ` Paolo Abeni
@ 2026-07-03  1:26   ` Shiming Cheng (成诗明)
  0 siblings, 0 replies; 3+ messages in thread
From: Shiming Cheng (成诗明) @ 2026-07-03  1:26 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, dsahern@kernel.org,
	imv4bel@gmail.com, linux-mediatek@lists.infradead.org,
	alice@isovalent.com, daniel.zahka@gmail.com,
	eilaimemedsnaimel@gmail.com, nbd@nbd.name, horms@kernel.org,
	kuba@kernel.org, willemb@google.com, pabeni@redhat.com,
	edumazet@google.com, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, matthias.bgg@gmail.com,
	davem@davemloft.net, AngeloGioacchino Del Regno,
	sd@queasysnail.net
  Cc: Lena Wang (王娜), stable@vger.kernel.org

On Thu, 2026-07-02 at 12:02 +0200, Paolo Abeni wrote:
> Note: the patch subject is quite incorrect
> 
> On 6/30/26 4:35 AM, Shiming Cheng wrote:
> > The new skb_gro_receive_list() function is missing a critical
> > safety check
> > present in the legacy skb_gro_receive() path. Specifically, it does
> > not
> > validate NAPI_GRO_CB(skb)->flush before allowing packet
> > aggregation.
> 
> skb_gro_receive_list() is not very "new" and definitely
> skb_gro_receive() is not legacy.
> 

The wording here may need to be adjusted. I was referring to the
chronological order, i.e. which function came first.

Updated:
The skb_gro_receive_list() function is missing a critical safety check
that exists in the skb_gro_receive() implementation. Specifically, it
does not validate NAPI_GRO_CB(skb)->flush before allowing packet
aggregation.

> > This allows already-GRO'd packets with existing frag_list to be
> > re-aggregated into a new GRO session, corrupting the frag_list
> > chain
> > structure. When skb_segment() attempts to unpack these malformed
> > packets,
> > it encounters invalid state and triggers a kernel panic.
> > 
> > Scenario (Tethering/Device forwarding):
> >   1. Driver: Generated aggregated packet P1 via LRO with frag_list
> >   2. Dev A: Receives aggregated fraglist packet and flush flag set
> >   3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
> >   4. Missing flush check allows re-aggregation despite flush flag
> >   5. Frag_list chain becomes corrupted (loops or dangling refs)
> >   6. Dev B: TX path calls skb_segment(), crashes on corrupted
> > frag_list
> 
> I can't parse the above. Is this something that can happen with in-
> tree
> drivers or do you need OoT module to trigger it? In any case please
> clarify the actual order and the involved driver. Possibly a stack
> trace leading to the critical aggregation could help.
> 

We are hitting a GRO/LRO-related failure in a tethering scenario. 

On the RX path, the driver performs an LRO-style aggregation before
handing packets to the stack. When `nfrags` exceeds 17, additional
packets are no longer appended to the frags array, but are attached
through `skb_shared_info(skb)->frag_list`. After that, the driver still
passes the skb into `napi_gro_receive()`, so the same traffic goes
through a second aggregation stage in GRO.
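
The driver code is not part of this thread, so the following is only a rough user-space sketch of the overflow behavior described above. All names (`struct toy_rx_skb`, `TOY_MAX_FRAGS`, `toy_lro_append()`) are hypothetical, and the exact threshold handling is an assumption: the sketch fills 17 frag slots and chains every later segment on `frag_list`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FRAGS 17u	/* limit quoted in the description above */

/* Toy stand-in for an skb being built by a driver-side LRO stage. */
struct toy_rx_skb {
	unsigned int nr_frags;		/* fragments held directly in the frags array */
	struct toy_rx_skb *frag_list;	/* head of the overflow chain */
	struct toy_rx_skb *next;	/* link within an overflow chain */
};

/*
 * Append one received segment to the aggregate: consume a frags slot
 * while one is free, otherwise chain the segment at the tail of
 * frag_list. Returns 1 if the segment was chained, 0 if it was
 * absorbed into the frags array.
 */
static int toy_lro_append(struct toy_rx_skb *agg, struct toy_rx_skb *seg)
{
	struct toy_rx_skb **link;

	assert(agg != NULL && seg != NULL);

	if (agg->nr_frags < TOY_MAX_FRAGS) {
		agg->nr_frags++;
		return 0;
	}

	seg->next = NULL;
	for (link = &agg->frag_list; *link != NULL; link = &(*link)->next)
		;
	*link = seg;
	return 1;
}
```

An aggregate built this way already owns a `frag_list` before it ever reaches `napi_gro_receive()`, which is the precondition for the nested layout discussed in this message.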

In our tethering case, `NAPI_GRO_CB(skb)->is_flist = !sk`, so
`is_flist` becomes `true`, and the skb follows the `SKB_GSO_FRAGLIST`
path, eventually reaching `skb_gro_receive_list()`. The issue is that
some later skbs may already carry their own `frag_list` as a result of
the first aggregation done by the driver. When GRO links those skbs
again into a new `frag_list` chain, the resulting skb layout becomes
more complex than expected and eventually triggers the kernel
exception.

The actual skb relationships when the issue occurs are as follows:
A->frag_list = B
B->next      = C
C->frag_list = D

In the observed layout, A already links `B -> C` through `frag_list`,
while C itself still carries its own `frag_list -> D`. In other words,
when GRO continues chaining skbs in `skb_gro_receive_list()`, the later
skb is no longer a simple standalone packet, but an skb that already
carries `shared_info->frag_list` from the driver-side LRO stage. This
creates a nested `frag_list` layout and eventually triggers the kernel
exception in our case.
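
The A/B/C/D layout above can be reproduced with toy structures to make the nesting explicit. This is a sketch under stated assumptions: `struct toy_chain_skb` and `toy_has_nested_frag_list()` are hypothetical names, and the `frag_list` field only stands in for `skb_shinfo(skb)->frag_list`. It is not a proposed kernel check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff; only the two link fields matter here. */
struct toy_chain_skb {
	struct toy_chain_skb *next;		/* next skb in a chain */
	struct toy_chain_skb *frag_list;	/* models skb_shinfo(skb)->frag_list */
};

/*
 * Walk the frag_list chain of head and report whether any member
 * carries a frag_list of its own, i.e. whether the layout is nested.
 */
static bool toy_has_nested_frag_list(const struct toy_chain_skb *head)
{
	const struct toy_chain_skb *iter;

	assert(head != NULL);

	for (iter = head->frag_list; iter != NULL; iter = iter->next) {
		if (iter->frag_list != NULL)
			return true;
	}
	return false;
}
```

Building `A->frag_list = B`, `B->next = C`, `C->frag_list = D` and calling `toy_has_nested_frag_list(&A)` returns true. Clearing `C->frag_list` gives the flat chain that a fraglist consumer would normally expect.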

> > Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return
> > check in
> > skb_gro_receive_list(), matching the defensive programming pattern
> > of
> > skb_gro_receive().
> > 
> > Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")
> 
> The fix tag is wrong, should be:
> 
> Fixes: 3a1296a38d0c ('net: Support GRO/GSO fraglist chaining.')
> 

I will update it in the next patch.
> /P
> 
