* [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
From: Shiming Cheng @ 2026-06-30 2:35 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, matthias.bgg,
angelogioacchino.delregno, willemb, daniel.zahka, alice, sd,
eilaimemedsnaimel, imv4bel, nbd, dsahern, netdev, linux-kernel,
linux-arm-kernel, linux-mediatek
Cc: stable, lena.wang, shiming.cheng
The new skb_gro_receive_list() function is missing a critical safety check
present in the legacy skb_gro_receive() path. Specifically, it does not
validate NAPI_GRO_CB(skb)->flush before allowing packet aggregation.
This allows already-GRO'd packets with existing frag_list to be
re-aggregated into a new GRO session, corrupting the frag_list chain
structure. When skb_segment() attempts to unpack these malformed packets,
it encounters invalid state and triggers a kernel panic.
Scenario (Tethering/Device forwarding):
1. Driver: Generated aggregated packet P1 via LRO with frag_list
2. Dev A: Receives aggregated fraglist packet and flush flag set
3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
4. Missing flush check allows re-aggregation despite flush flag
5. Frag_list chain becomes corrupted (loops or dangling refs)
6. Dev B: TX path calls skb_segment(), crashes on corrupted frag_list
Root cause in skb_segment():
The check at line ~4891:
if (hsize <= 0 && i >= nfrags && skb_headlen(list_skb) &&
(skb_headlen(list_skb) == len || sg)) {
When the frag_list is corrupted by double aggregation, list_skb can end
up as a NULL pointer taken from skb->next, and skb_headlen(list_skb)
then dereferences a NULL or corrupted pointer.
Call Trace:
skb_headlen(NULL skb)
skb_segment
tcp_gso_segment
tcp4_gso_segment
inet_gso_segment
skb_mac_gso_segment
__skb_gso_segment
skb_gso_segment
validate_xmit_skb
validate_xmit_skb_list
sch_direct_xmit
qdisc_restart
__qdisc_run
qdisc_run
net_tx_action
Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return check in
skb_gro_receive_list(), matching the defensive programming pattern of
skb_gro_receive().
Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")
Cc: stable@vger.kernel.org
Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com>
---
net/core/gro.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/gro.c b/net/core/gro.c
index 35f2f708f010..076247c1e662 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -229,7 +229,8 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
 {
-	if (unlikely(p->len + skb->len >= 65536))
+	if (unlikely(p->len + skb->len >= 65536 ||
+		     NAPI_GRO_CB(skb)->flush))
 		return -E2BIG;
 	if (!pskb_may_pull(skb, skb_gro_offset(skb))) {
--
2.45.2
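Editorial sketch: the early-return check changed by the hunk above can be modeled in userspace roughly as follows. This is not kernel code. `struct model_skb` and `model_gro_receive_list_check()` are hypothetical names introduced only for illustration, and the `flush` member merely stands in for `NAPI_GRO_CB(skb)->flush`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff plus its GRO control block. */
struct model_skb {
	unsigned int len;	/* total packet length in bytes */
	bool flush;		/* models NAPI_GRO_CB(skb)->flush */
};

/*
 * Models only the early-return check of the patched function:
 * refuse to chain when the merged length would reach 64 KiB or
 * when the incoming skb is marked for flush.
 */
static int model_gro_receive_list_check(const struct model_skb *p,
					const struct model_skb *skb)
{
	assert(p != NULL && skb != NULL);

	if (p->len + skb->len >= 65536 || skb->flush)
		return -E2BIG;

	return 0;
}
```

With this model, a flush-marked skb is rejected with -E2BIG regardless of its length, which is the behavior the patch aims for.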
* Re: [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
From: Paolo Abeni @ 2026-07-02 10:02 UTC (permalink / raw)
To: Shiming Cheng, davem, edumazet, kuba, horms, matthias.bgg,
angelogioacchino.delregno, willemb, daniel.zahka, alice, sd,
eilaimemedsnaimel, imv4bel, nbd, dsahern, netdev, linux-kernel,
linux-arm-kernel, linux-mediatek
Cc: stable, lena.wang
Note: the patch subject is quite incorrect
On 6/30/26 4:35 AM, Shiming Cheng wrote:
> The new skb_gro_receive_list() function is missing a critical safety check
> present in the legacy skb_gro_receive() path. Specifically, it does not
> validate NAPI_GRO_CB(skb)->flush before allowing packet aggregation.
skb_gro_receive_list() is not very "new" and definitely
skb_gro_receive() is not legacy.
> This allows already-GRO'd packets with existing frag_list to be
> re-aggregated into a new GRO session, corrupting the frag_list chain
> structure. When skb_segment() attempts to unpack these malformed packets,
> it encounters invalid state and triggers a kernel panic.
>
> Scenario (Tethering/Device forwarding):
> 1. Driver: Generated aggregated packet P1 via LRO with frag_list
> 2. Dev A: Receives aggregated fraglist packet and flush flag set
> 3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
> 4. Missing flush check allows re-aggregation despite flush flag
> 5. Frag_list chain becomes corrupted (loops or dangling refs)
> 6. Dev B: TX path calls skb_segment(), crashes on corrupted frag_list
I can't parse the above. Is this something that can happen with in-tree
drivers or do you need an OoT module to trigger it? In any case please
clarify the actual order and the involved driver. Possibly a stack
trace leading to the critical aggregation could help.
> Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return check in
> skb_gro_receive_list(), matching the defensive programming pattern of
> skb_gro_receive().
>
> Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")
The fix tag is wrong, should be:
Fixes: 3a1296a38d0c ('net: Support GRO/GSO fraglist chaining.')
/P
* Re: [PATCH v3] Subject: [PATCH] net: gro: fix double aggregation of flush-marked skbs
From: Shiming Cheng (成诗明) @ 2026-07-03 1:26 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org, dsahern@kernel.org,
imv4bel@gmail.com, linux-mediatek@lists.infradead.org,
alice@isovalent.com, daniel.zahka@gmail.com,
eilaimemedsnaimel@gmail.com, nbd@nbd.name, horms@kernel.org,
kuba@kernel.org, willemb@google.com, pabeni@redhat.com,
edumazet@google.com, netdev@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, matthias.bgg@gmail.com,
davem@davemloft.net, AngeloGioacchino Del Regno,
sd@queasysnail.net
Cc: Lena Wang (王娜), stable@vger.kernel.org
On Thu, 2026-07-02 at 12:02 +0200, Paolo Abeni wrote:
> Note: the patch subject is quite incorrect
>
> On 6/30/26 4:35 AM, Shiming Cheng wrote:
> > The new skb_gro_receive_list() function is missing a critical
> > safety check
> > present in the legacy skb_gro_receive() path. Specifically, it does
> > not
> > validate NAPI_GRO_CB(skb)->flush before allowing packet
> > aggregation.
>
> skb_gro_receive_list() is not very "new" and definitely
> skb_gro_receive() is not legacy.
>
The wording here needs to be adjusted. I was referring to chronological
order, i.e. which function was introduced first.
Updated:
The skb_gro_receive_list() function is missing a critical safety check
that exists in the skb_gro_receive() implementation. Specifically, it
does not validate NAPI_GRO_CB(skb)->flush before allowing packet
aggregation.
> > This allows already-GRO'd packets with existing frag_list to be
> > re-aggregated into a new GRO session, corrupting the frag_list
> > chain
> > structure. When skb_segment() attempts to unpack these malformed
> > packets,
> > it encounters invalid state and triggers a kernel panic.
> >
> > Scenario (Tethering/Device forwarding):
> > 1. Driver: Generated aggregated packet P1 via LRO with frag_list
> > 2. Dev A: Receives aggregated fraglist packet and flush flag set
> > 3. Dev A: Re-enters GRO, skb_gro_receive_list() is called
> > 4. Missing flush check allows re-aggregation despite flush flag
> > 5. Frag_list chain becomes corrupted (loops or dangling refs)
> > 6. Dev B: TX path calls skb_segment(), crashes on corrupted
> > frag_list
>
> I can't parse the above. Is this something that can happen with
> in-tree drivers or do you need an OoT module to trigger it? In any
> case please clarify the actual order and the involved driver.
> Possibly a stack trace leading to the critical aggregation could help.
>
We are hitting a GRO/LRO-related failure in a tethering scenario.
On the RX path, the driver performs an LRO-style aggregation before
handing packets to the stack. When `nfrags` exceeds 17, additional
packets are no longer appended to the frags array, but are attached
through `skb_shared_info(skb)->frag_list`. After that, the driver still
passes the skb into `napi_gro_receive()`, so the same traffic goes
through a second aggregation stage in GRO.
In our tethering case the traffic is forwarded, so no local socket
matches and `sk` is NULL. With `NAPI_GRO_CB(skb)->is_flist = !sk`,
`is_flist` becomes `true`, and the skb follows the `SKB_GSO_FRAGLIST`
path, eventually reaching `skb_gro_receive_list()`. The issue is that
some later skbs may already carry their own `frag_list` as a result of
the first aggregation done by the driver. When GRO links those skbs
again into a new `frag_list` chain, the resulting skb layout becomes
more complex than expected and eventually triggers the kernel
exception.
The actual skb relationships when the issue occurs are as follows:
A->frag_list = B
B->next = C
C->frag_list = D
In the observed layout, A already links `B -> C` through `frag_list`,
while C itself still carries its own `frag_list -> D`. In other words,
when GRO continues chaining skbs in `skb_gro_receive_list()`, the later
skb is no longer a simple standalone packet, but an skb that already
carries `shared_info->frag_list` from the driver-side LRO stage. This
creates a nested `frag_list` layout and eventually triggers the kernel
exception in our case.
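Editorial sketch: the nested layout described above (A->frag_list = B, B->next = C, C->frag_list = D) can be reproduced in a small userspace model. `struct flist_skb` and `model_has_nested_frag_list()` are hypothetical names used only for illustration, and the `frag_list` member stands in for `skb_shinfo(skb)->frag_list`. None of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the linkage is modeled. */
struct flist_skb {
	struct flist_skb *next;		/* next skb in a frag_list chain */
	struct flist_skb *frag_list;	/* models skb_shinfo(skb)->frag_list */
};

/*
 * Walk the chain hanging off head->frag_list and report whether any
 * member still carries a frag_list of its own, i.e. whether the layout
 * is nested rather than a flat chain of standalone skbs.
 */
static bool model_has_nested_frag_list(const struct flist_skb *head)
{
	const struct flist_skb *iter;

	assert(head != NULL);

	for (iter = head->frag_list; iter != NULL; iter = iter->next) {
		if (iter->frag_list != NULL)
			return true;
	}

	return false;
}
```

For the layout reported in this message the helper returns true, because C is a member of A's chain and still points at D through its own `frag_list`. A flat chain, where no member has a `frag_list`, yields false.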
> > Fix: Add NAPI_GRO_CB(skb)->flush validation to the early-return
> > check in
> > skb_gro_receive_list(), matching the defensive programming pattern
> > of
> > skb_gro_receive().
> >
> > Fixes: 8928756d53d5 ("net: add fraglist GRO/GSO support")
>
> The fix tag is wrong, should be:
>
> Fixes: 3a1296a38d0c ('net: Support GRO/GSO fraglist chaining.')
>
I will update it in the next patch.
> /P
>