* [PATCH net v3] net: skbuff: propagate shared-frag marker through frag-transfer helpers
From: Hyunwoo Kim @ 2026-05-14 11:57 UTC
To: davem, edumazet, kuba, pabeni, horms, kerneljasonxing, kuniyu,
mhal, jiayuan.chen, steffen.klassert, vakzz, ben, herbert,
dsahern, sultan, sd
Cc: netdev, stable, imv4bel

Three frag-transfer helpers (__pskb_copy_fclone(), skb_try_coalesce(),
and skb_shift()) fail to propagate the SKBFL_SHARED_FRAG bit in
skb_shinfo()->flags when moving frags from source to destination.
__pskb_copy_fclone() defers the rest of the shinfo metadata to
skb_copy_header() after copying frag descriptors, but that helper
only carries over gso_{size,segs,type} and never touches
skb_shinfo()->flags; skb_try_coalesce() and skb_shift() move frag
descriptors directly and leave flags untouched. As a result, the
destination skb keeps a reference to the same externally-owned or
page-cache-backed pages while reporting skb_has_shared_frag() as
false.

The mismatch is harmful in any in-place writer that uses
skb_has_shared_frag() to decide whether shared pages must be detoured
through skb_cow_data(). ESP input is one such writer (esp4.c,
esp6.c), and a single nft 'dup to <local>' rule -- or any other
nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
skb in esp_input() with the marker stripped, letting an unprivileged
user write into the page cache of a root-owned read-only file via
authencesn-ESN stray writes.

Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
were actually moved from the source. skb_copy() and skb_copy_expand()
share skb_copy_header() too but linearize all paged data into freshly
allocated head storage and emerge with nr_frags == 0, so
skb_has_shared_frag() returns false on its own; they need no change.

The same omission exists in skb_gro_receive() and skb_gro_receive_list().
The former moves the incoming skb's frag descriptors into the
accumulator's last sub-skb via two paths (a direct frag-move loop and
the head_frag + memcpy path); the latter chains the incoming skb whole
onto p's frag_list. Downstream skb_segment() reads only
skb_shinfo(p)->flags, and skb_segment_list() reuses each sub-skb's
shinfo as the nskb -- both p and lp must carry the marker.

Fixes: cef401de7be8 ("net: fix possible wrong checksum generation")
Fixes: f4c50a4034e6 ("xfrm: esp: avoid in-place decrypt on shared skb frags")
Reported-by: William Bowling <vakzz@zellic.io>
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
Changes in v3:
- Include the skb_gro_receive() audit patch suggested by Sultan
- v2: https://lore.kernel.org/all/agToIEDI4TaTNLRb@v4bel/
Changes in v2:
- Also propagate SHARED_FRAG in skb_try_coalesce() and skb_shift()
- v1: https://lore.kernel.org/all/agRfuVOeMI5pbHhY@v4bel/
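
Illustration only, not part of the patch: a minimal userspace sketch of
the propagation rule described in the commit message. Every model_* and
MODEL_* name is hypothetical and merely stands in for skb_shared_info,
its frags[] array and SKBFL_SHARED_FRAG; the bit value is arbitrary and
page refcounting is ignored.

```c
/* Userspace model only; none of these types exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_FRAGS   17
#define MODEL_SHARED_FRAG 0x2u	/* hypothetical stand-in for SKBFL_SHARED_FRAG */

struct model_frag {
	const void *page;	/* page the descriptor points at */
	unsigned int off;
	unsigned int size;
};

struct model_shinfo {
	unsigned int flags;
	unsigned int nr_frags;
	struct model_frag frags[MODEL_MAX_FRAGS];
};

/* Same idea as skb_has_shared_frag(): paged data plus the marker. */
static bool model_has_shared_frag(const struct model_shinfo *si)
{
	return si->nr_frags > 0 && (si->flags & MODEL_SHARED_FRAG);
}

/*
 * Append all of @from's frag descriptors to @to. Only descriptors are
 * copied, so @to ends up pointing at the very same pages as @from.
 */
static void model_move_frags(struct model_shinfo *to,
			     const struct model_shinfo *from)
{
	assert(to->nr_frags + from->nr_frags <= MODEL_MAX_FRAGS);

	memcpy(&to->frags[to->nr_frags], from->frags,
	       from->nr_frags * sizeof(from->frags[0]));
	to->nr_frags += from->nr_frags;

	/* The missing step: inherit the marker, but only when
	 * descriptors were actually transferred. */
	if (from->nr_frags)
		to->flags |= from->flags & MODEL_SHARED_FRAG;
}
```

With the final if-block removed, model_has_shared_frag() on the
destination returns false even though its descriptors still reference
the source's pages, which is the mismatch the patch closes.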
---
net/core/gro.c | 4 ++++
net/core/skbuff.c | 5 +++++
2 files changed, 9 insertions(+)
diff --git a/net/core/gro.c b/net/core/gro.c
index 31d21de5b15a..9f8960789b2c 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -213,10 +213,12 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
p->data_len += len;
p->truesize += delta_truesize;
p->len += len;
+ skb_shinfo(p)->flags |= skbinfo->flags & SKBFL_SHARED_FRAG;
if (lp != p) {
lp->data_len += len;
lp->truesize += delta_truesize;
lp->len += len;
+ skb_shinfo(lp)->flags |= skbinfo->flags & SKBFL_SHARED_FRAG;
}
NAPI_GRO_CB(skb)->same_flow = 1;
return 0;
@@ -244,6 +246,8 @@ int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
p->truesize += skb->truesize;
p->len += skb->len;
+ skb_shinfo(p)->flags |= skb_shinfo(skb)->flags & SKBFL_SHARED_FRAG;
+
NAPI_GRO_CB(skb)->same_flow = 1;
return 0;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7dad68e3b518..7cd388504297 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2248,6 +2248,7 @@ struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
skb_frag_ref(skb, i);
}
skb_shinfo(n)->nr_frags = i;
+ skb_shinfo(n)->flags |= skb_shinfo(skb)->flags & SKBFL_SHARED_FRAG;
}
if (skb_has_frag_list(skb)) {
@@ -4349,6 +4350,8 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
tgt->ip_summed = CHECKSUM_PARTIAL;
skb->ip_summed = CHECKSUM_PARTIAL;
+ skb_shinfo(tgt)->flags |= skb_shinfo(skb)->flags & SKBFL_SHARED_FRAG;
+
skb_len_add(skb, -shiftlen);
skb_len_add(tgt, shiftlen);
@@ -6200,6 +6203,8 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
from_shinfo->frags,
from_shinfo->nr_frags * sizeof(skb_frag_t));
to_shinfo->nr_frags += from_shinfo->nr_frags;
+ if (from_shinfo->nr_frags)
+ to_shinfo->flags |= from_shinfo->flags & SKBFL_SHARED_FRAG;
if (!skb_cloned(from))
from_shinfo->nr_frags = 0;
--
2.43.0
* Re: [PATCH net v3] net: skbuff: propagate shared-frag marker through frag-transfer helpers
From: Sabrina Dubroca @ 2026-05-14 18:49 UTC
To: Hyunwoo Kim
Cc: davem, edumazet, kuba, pabeni, horms, kerneljasonxing, kuniyu,
mhal, jiayuan.chen, steffen.klassert, vakzz, ben, herbert,
dsahern, sultan, netdev, stable
I think we should also be propagating SKBFL_SHARED_FRAG in
tcp_clone_payload(). It's copying frags from skbs in sk_write_queue to
a new skb in the same way as those functions you're fixing here.
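
To make the pattern concrete, here is a rough userspace sketch of a
tcp_clone_payload()-style gather loop. It is an assumption-laden model,
not the kernel function: the model_* / MODEL_* names are hypothetical,
the write queue is a plain array, and the bit value is arbitrary. It
only shows where the marker would be picked up while frags from several
source buffers are collected into one destination.

```c
/* Userspace model only; none of these types exist in the kernel. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS   17
#define MODEL_SHARED_FRAG 0x2u	/* hypothetical stand-in for SKBFL_SHARED_FRAG */

struct model_frag {
	const void *page;
	unsigned int off;
	unsigned int size;
};

struct model_buf {
	unsigned int flags;
	unsigned int nr_frags;
	struct model_frag frags[MODEL_MAX_FRAGS];
};

/*
 * Gather up to @probe_size bytes of frag descriptors from @queue into
 * @to. Returns the number of bytes gathered, or -1 if @to runs out of
 * frag slots.
 */
static int model_clone_payload(struct model_buf *to,
			       const struct model_buf *queue, size_t nqueue,
			       unsigned int probe_size)
{
	struct model_frag *last = NULL;
	unsigned int len = 0;

	assert(to->nr_frags == 0);

	for (size_t q = 0; q < nqueue; q++) {
		const struct model_buf *src = &queue[q];

		for (unsigned int i = 0; i < src->nr_frags; i++) {
			const struct model_frag *f = &src->frags[i];
			unsigned int todo;

			if (len >= probe_size)
				return (int)len;

			todo = f->size < probe_size - len ?
			       f->size : probe_size - len;
			len += todo;

			/* This source contributes data: inherit its marker. */
			to->flags |= src->flags & MODEL_SHARED_FRAG;

			/* Extend the previous frag if this one is contiguous. */
			if (last && f->page == last->page &&
			    f->off == last->off + last->size) {
				last->size += todo;
				continue;
			}

			if (to->nr_frags == MODEL_MAX_FRAGS)
				return -1;

			last = &to->frags[to->nr_frags++];
			last->page = f->page;
			last->off = f->off;
			last->size = todo;
		}
	}
	return (int)len;
}
```

In this model the marker is ORed in only for sources that actually
contribute bytes, so a shared buffer sitting further down the queue
than probe_size reaches does not taint the destination.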
-------- 8< --------
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f9d8755705f7..6e4bb411dc04 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2626,6 +2626,7 @@ static int tcp_clone_payload(struct sock *sk, struct sk_buff *to,
todo = min_t(int, skb_frag_size(fragfrom),
probe_size - len);
len += todo;
+ skb_shinfo(to)->flags |= skb_shinfo(skb)->flags & SKBFL_SHARED_FRAG;
if (lastfrag &&
skb_frag_page(fragfrom) == skb_frag_page(lastfrag) &&
skb_frag_off(fragfrom) == skb_frag_off(lastfrag) +
--
Sabrina
* Re: [PATCH net v3] net: skbuff: propagate shared-frag marker through frag-transfer helpers
From: Hyunwoo Kim @ 2026-05-14 21:52 UTC
To: Sabrina Dubroca
Cc: davem, edumazet, kuba, pabeni, horms, kerneljasonxing, kuniyu,
mhal, jiayuan.chen, steffen.klassert, vakzz, ben, herbert,
dsahern, sultan, netdev, stable, imv4bel
On Thu, May 14, 2026 at 08:49:50PM +0200, Sabrina Dubroca wrote:
> I think we should also be propagating SKBFL_SHARED_FRAG in
> tcp_clone_payload(). It's copying frags from skbs in sk_write_queue to
> a new skb in the same way as those functions you're fixing here.
Agreed. tcp_clone_payload() is the same pattern and the propagation is
missing.

On current mainline with v3 applied I couldn't find an obvious way to
drive this into a user-page write on its own, since the TX-side write
paths I checked all stage their writes through the linear region first
(via skb_ensure_writable / skb_cow_data).
That said, a future TX consumer that depends on the flag would regress
immediately, so the fix is right.
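
As a rough illustration of what such a consumer would look like, here is
a userspace sketch of a writer that trusts the marker to decide between
writing in place and copying first. It is not kernel code: the model_*
names are hypothetical, the packet is reduced to a single data region,
and the XOR is just a placeholder for any in-place transform.

```c
/* Userspace model only; none of these types exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_SHARED_FRAG 0x2u	/* hypothetical stand-in for SKBFL_SHARED_FRAG */

struct model_pkt {
	unsigned int flags;
	unsigned char *data;	/* single paged region */
	size_t len;
	bool owns_data;		/* true once data points at a private copy */
};

/* Make the payload safe to modify. Returns 0 on success, -1 on ENOMEM. */
static int model_make_writable(struct model_pkt *pkt)
{
	unsigned char *copy;

	assert(pkt->data && pkt->len > 0);

	/* No marker: the pages are trusted to be private. */
	if (!(pkt->flags & MODEL_SHARED_FRAG))
		return 0;

	/* Marker set: detour through a private copy before writing. */
	copy = malloc(pkt->len);
	if (!copy)
		return -1;
	memcpy(copy, pkt->data, pkt->len);
	pkt->data = copy;
	pkt->owns_data = true;
	pkt->flags &= ~MODEL_SHARED_FRAG;
	return 0;
}

/* Placeholder in-place writer; the caller frees data if owns_data is set. */
static int model_xor_in_place(struct model_pkt *pkt, unsigned char key)
{
	if (model_make_writable(pkt))
		return -1;
	for (size_t i = 0; i < pkt->len; i++)
		pkt->data[i] ^= key;
	return 0;
}
```

If a frag-transfer helper hands this writer a packet whose marker was
dropped, model_make_writable() returns early and the XOR lands directly
on the shared page.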
I'll wait for a bit more review on the current patch and then fold this
change into v4. Thank you.

Best regards,
Hyunwoo Kim