Linux kernel -stable discussions
* [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy()
@ 2026-05-13 11:25 Hyunwoo Kim
  2026-05-13 16:21 ` Ben Hutchings
  0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-05-13 11:25 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, steffen.klassert, herbert, dsahern,
	vakzz
  Cc: stable, netdev, imv4bel

__pskb_copy_fclone() shallow-copies the source's frag descriptors and
bumps each page's refcount via skb_frag_ref(), then defers the rest
of the shinfo metadata to skb_copy_header().  That helper only carries
over gso_{size,segs,type} and never touches skb_shinfo()->flags, so
the destination skb keeps a reference to the same externally-owned or
page-cache-backed pages while reporting skb_has_shared_frag() as
false.

The mismatch is harmful in any in-place writer that uses
skb_has_shared_frag() to decide whether shared pages must be detoured
through skb_cow_data().  ESP input is one such writer (esp4.c,
esp6.c), and a single nft 'dup to <local>' rule -- or any other
nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
skb in esp_input() with the marker stripped, letting an unprivileged
user write into the page cache of a root-owned read-only file via
authencesn-ESN stray writes.

Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
were actually moved from the source.  skb_copy() and skb_copy_expand()
share skb_copy_header() too but linearize all paged data into freshly
allocated head storage and emerge with nr_frags == 0, so
skb_has_shared_frag() returns false on its own; they need no change.

Fixes: cef401de7be8 ("net: fix possible wrong checksum generation")
Fixes: f4c50a4034e6 ("xfrm: esp: avoid in-place decrypt on shared skb frags")
Reported-by: William Bowling <vakzz@zellic.io>
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 net/core/skbuff.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7dad68e3b518..15bdec53e8d9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2248,6 +2248,7 @@ struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
 			skb_frag_ref(skb, i);
 		}
 		skb_shinfo(n)->nr_frags = i;
+		skb_shinfo(n)->flags |= skb_shinfo(skb)->flags & SKBFL_SHARED_FRAG;
 	}
 
 	if (skb_has_frag_list(skb)) {
@@ -6200,6 +6201,8 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
 	       from_shinfo->frags,
 	       from_shinfo->nr_frags * sizeof(skb_frag_t));
 	to_shinfo->nr_frags += from_shinfo->nr_frags;
+	if (from_shinfo->nr_frags)
+		to_shinfo->flags |= from_shinfo->flags & SKBFL_SHARED_FRAG;
 
 	if (!skb_cloned(from))
 		from_shinfo->nr_frags = 0;
-- 
2.43.0
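
For readers following the description above, here is a minimal userspace sketch of the problem and the fix. It is a toy model, not kernel code: every toy_* name, the TOY_SHARED_FRAG bit, and the propagate_marker switch are hypothetical stand-ins, and the real skb_has_shared_frag() and esp_input() checks are more involved than the one-line predicates used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#define TOY_MAX_FRAGS   4
#define TOY_SHARED_FRAG 0x1u    /* stands in for SKBFL_SHARED_FRAG */

struct toy_page {
	int refcount;
};

struct toy_skb {
	unsigned int flags;                     /* models skb_shinfo()->flags */
	size_t nr_frags;                        /* models skb_shinfo()->nr_frags */
	struct toy_page *frags[TOY_MAX_FRAGS];  /* frag descriptors */
};

/* Roughly models skb_has_shared_frag(): paged data exists and is marked. */
static bool toy_has_shared_frag(const struct toy_skb *skb)
{
	return skb->nr_frags > 0 && (skb->flags & TOY_SHARED_FRAG);
}

/*
 * Models the shallow copy: the destination points at the same pages and
 * takes a reference on each.  propagate_marker == false models the
 * pre-patch behaviour, true models the one-line fix.
 */
static void toy_pskb_copy(struct toy_skb *dst, const struct toy_skb *src,
			  bool propagate_marker)
{
	size_t i;

	dst->flags = 0;
	for (i = 0; i < src->nr_frags; i++) {
		dst->frags[i] = src->frags[i];
		dst->frags[i]->refcount++;
	}
	dst->nr_frags = i;
	if (propagate_marker)
		dst->flags |= src->flags & TOY_SHARED_FRAG;
}

/* Models a linearizing copy: no frag descriptors survive. */
static void toy_skb_copy(struct toy_skb *dst, const struct toy_skb *src)
{
	(void)src;          /* payload copy into head storage is omitted */
	dst->flags = 0;
	dst->nr_frags = 0;
}

/* Simplified in-place writer: take the COW detour only if the marker is seen. */
static bool toy_writer_must_cow(const struct toy_skb *skb)
{
	return toy_has_shared_frag(skb);
}
```

With a source that has one marked frag, toy_pskb_copy(&dst, &src, false) leaves toy_writer_must_cow(&dst) false even though dst shares the page, while passing true restores the detour. toy_skb_copy() needs no marker because it ends with nr_frags == 0.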



* Re: [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy()
  2026-05-13 11:25 [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy() Hyunwoo Kim
@ 2026-05-13 16:21 ` Ben Hutchings
  2026-05-13 16:24   ` Hyunwoo Kim
  2026-05-13 17:16   ` Hyunwoo Kim
  0 siblings, 2 replies; 5+ messages in thread
From: Ben Hutchings @ 2026-05-13 16:21 UTC (permalink / raw)
  To: Hyunwoo Kim, davem, edumazet, kuba, pabeni, steffen.klassert,
	herbert, dsahern, vakzz
  Cc: stable, netdev


On Wed, 2026-05-13 at 20:25 +0900, Hyunwoo Kim wrote:
> __pskb_copy_fclone() shallow-copies the source's frag descriptors and
> bumps each page's refcount via skb_frag_ref(), then defers the rest
> of the shinfo metadata to skb_copy_header().  That helper only carries
> over gso_{size,segs,type} and never touches skb_shinfo()->flags, so
> the destination skb keeps a reference to the same externally-owned or
> page-cache-backed pages while reporting skb_has_shared_frag() as
> false.
>
> The mismatch is harmful in any in-place writer that uses
> skb_has_shared_frag() to decide whether shared pages must be detoured
> through skb_cow_data().  ESP input is one such writer (esp4.c,
> esp6.c), and a single nft 'dup to <local>' rule -- or any other
> nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
> skb in esp_input() with the marker stripped, letting an unprivileged
> user write into the page cache of a root-owned read-only file via
> authencesn-ESN stray writes.
> 
> Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
> were actually moved from the source.  skb_copy() and skb_copy_expand()
> share skb_copy_header() too but linearize all paged data into freshly
> allocated head storage and emerge with nr_frags == 0, so
> skb_has_shared_frag() returns false on its own; they need no change.

What about skb_shift()?  It seems like that should also propagate this
flag.  But I could be missing some reason why it's not necessary.

Ben.


-- 
Ben Hutchings
Tomorrow will be cancelled due to lack of interest.



* Re: [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy()
  2026-05-13 16:21 ` Ben Hutchings
@ 2026-05-13 16:24   ` Hyunwoo Kim
  2026-05-13 17:16   ` Hyunwoo Kim
  1 sibling, 0 replies; 5+ messages in thread
From: Hyunwoo Kim @ 2026-05-13 16:24 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert, dsahern,
	vakzz, stable, netdev, imv4bel

On Wed, May 13, 2026 at 06:21:45PM +0200, Ben Hutchings wrote:
> On Wed, 2026-05-13 at 20:25 +0900, Hyunwoo Kim wrote:
> > __pskb_copy_fclone() shallow-copies the source's frag descriptors and
> > bumps each page's refcount via skb_frag_ref(), then defers the rest
> > of the shinfo metadata to skb_copy_header().  That helper only carries
> > over gso_{size,segs,type} and never touches skb_shinfo()->flags, so
> > the destination skb keeps a reference to the same externally-owned or
> > page-cache-backed pages while reporting skb_has_shared_frag() as
> > false.
> >
> > The mismatch is harmful in any in-place writer that uses
> > skb_has_shared_frag() to decide whether shared pages must be detoured
> > through skb_cow_data().  ESP input is one such writer (esp4.c,
> > esp6.c), and a single nft 'dup to <local>' rule -- or any other
> > nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
> > skb in esp_input() with the marker stripped, letting an unprivileged
> > user write into the page cache of a root-owned read-only file via
> > authencesn-ESN stray writes.
> > 
> > Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
> > were actually moved from the source.  skb_copy() and skb_copy_expand()
> > share skb_copy_header() too but linearize all paged data into freshly
> > allocated head storage and emerge with nr_frags == 0, so
> > skb_has_shared_frag() returns false on its own; they need no change.
> 
> What about skb_shift()?  It seems like that should also propagate this
> flag.  But I could be missing some reason why it's not necessary.

That is one of the things I am testing.


Best regards,
Hyunwoo Kim





* Re: [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy()
  2026-05-13 16:21 ` Ben Hutchings
  2026-05-13 16:24   ` Hyunwoo Kim
@ 2026-05-13 17:16   ` Hyunwoo Kim
  2026-05-13 18:30     ` Hyunwoo Kim
  1 sibling, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-05-13 17:16 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert, dsahern,
	vakzz, stable, netdev, imv4bel

On Wed, May 13, 2026 at 06:21:45PM +0200, Ben Hutchings wrote:
> On Wed, 2026-05-13 at 20:25 +0900, Hyunwoo Kim wrote:
> > __pskb_copy_fclone() shallow-copies the source's frag descriptors and
> > bumps each page's refcount via skb_frag_ref(), then defers the rest
> > of the shinfo metadata to skb_copy_header().  That helper only carries
> > over gso_{size,segs,type} and never touches skb_shinfo()->flags, so
> > the destination skb keeps a reference to the same externally-owned or
> > page-cache-backed pages while reporting skb_has_shared_frag() as
> > false.
> >
> > The mismatch is harmful in any in-place writer that uses
> > skb_has_shared_frag() to decide whether shared pages must be detoured
> > through skb_cow_data().  ESP input is one such writer (esp4.c,
> > esp6.c), and a single nft 'dup to <local>' rule -- or any other
> > nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
> > skb in esp_input() with the marker stripped, letting an unprivileged
> > user write into the page cache of a root-owned read-only file via
> > authencesn-ESN stray writes.
> > 
> > Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
> > were actually moved from the source.  skb_copy() and skb_copy_expand()
> > share skb_copy_header() too but linearize all paged data into freshly
> > allocated head storage and emerge with nr_frags == 0, so
> > skb_has_shared_frag() returns false on its own; they need no change.
> 
> What about skb_shift()?  It seems like that should also propagate this
> flag.  But I could be missing some reason why it's not necessary.

Yes. skb_shift() also moves frag descriptors between skbs, so I think
SKBFL_SHARED_FRAG should be propagated there as well. Triggering it in
practice is tricky (it is not deterministic, because it depends on TCP
write-queue skb merging), but I believe the fix is the right thing to do.
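
To illustrate the point about moved frag descriptors, here is a small userspace sketch. It is a toy model under stated assumptions, not the kernel's skb_shift(): the toy_* names, the integer frag ids, and the propagate switch are hypothetical, partial-frag and merge handling are left out, and the exact form of the v2 change is not decided by this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#define TOY_MAX_FRAGS   8
#define TOY_SHARED_FRAG 0x1u    /* stands in for SKBFL_SHARED_FRAG */

struct toy_skb {
	unsigned int flags;         /* models skb_shinfo()->flags */
	size_t nr_frags;            /* models skb_shinfo()->nr_frags */
	int frags[TOY_MAX_FRAGS];   /* frag descriptors, reduced to ids */
};

static bool toy_has_shared_frag(const struct toy_skb *skb)
{
	return skb->nr_frags > 0 && (skb->flags & TOY_SHARED_FRAG);
}

/*
 * Move up to n whole frag descriptors from the front of src to the end
 * of tgt and return how many were moved.  When propagate is true and at
 * least one descriptor moved, the shared-frag marker follows the frags.
 */
static size_t toy_shift(struct toy_skb *tgt, struct toy_skb *src,
			size_t n, bool propagate)
{
	size_t moved = 0;
	size_t i;

	while (moved < n && moved < src->nr_frags &&
	       tgt->nr_frags < TOY_MAX_FRAGS)
		tgt->frags[tgt->nr_frags++] = src->frags[moved++];

	/* Close the gap left at the front of the source. */
	for (i = moved; i < src->nr_frags; i++)
		src->frags[i - moved] = src->frags[i];
	src->nr_frags -= moved;

	if (propagate && moved)
		tgt->flags |= src->flags & TOY_SHARED_FRAG;

	return moved;
}
```

In this model, shifting a marked frag into an unmarked target with propagate == false leaves toy_has_shared_frag(tgt) false although tgt now references the shared frag; with propagate == true the marker travels with the descriptors.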

I'm planning to submit a v2 patch. What do you think?


Best regards,
Hyunwoo Kim


* Re: [PATCH net] net: skbuff: propagate shared-frag marker through pskb_copy()
  2026-05-13 17:16   ` Hyunwoo Kim
@ 2026-05-13 18:30     ` Hyunwoo Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Hyunwoo Kim @ 2026-05-13 18:30 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem, edumazet, kuba, pabeni, steffen.klassert, herbert, dsahern,
	vakzz, stable, netdev, imv4bel

On Thu, May 14, 2026 at 02:16:31AM +0900, Hyunwoo Kim wrote:
> On Wed, May 13, 2026 at 06:21:45PM +0200, Ben Hutchings wrote:
> > On Wed, 2026-05-13 at 20:25 +0900, Hyunwoo Kim wrote:
> > > __pskb_copy_fclone() shallow-copies the source's frag descriptors and
> > > bumps each page's refcount via skb_frag_ref(), then defers the rest
> > > of the shinfo metadata to skb_copy_header().  That helper only carries
> > > over gso_{size,segs,type} and never touches skb_shinfo()->flags, so
> > > the destination skb keeps a reference to the same externally-owned or
> > > page-cache-backed pages while reporting skb_has_shared_frag() as
> > > false.
> > >
> > > The mismatch is harmful in any in-place writer that uses
> > > skb_has_shared_frag() to decide whether shared pages must be detoured
> > > through skb_cow_data().  ESP input is one such writer (esp4.c,
> > > esp6.c), and a single nft 'dup to <local>' rule -- or any other
> > > nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
> > > skb in esp_input() with the marker stripped, letting an unprivileged
> > > user write into the page cache of a root-owned read-only file via
> > > authencesn-ESN stray writes.
> > > 
> > > Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
> > > were actually moved from the source.  skb_copy() and skb_copy_expand()
> > > share skb_copy_header() too but linearize all paged data into freshly
> > > allocated head storage and emerge with nr_frags == 0, so
> > > skb_has_shared_frag() returns false on its own; they need no change.
> > 
> > What about skb_shift()?  It seems like that should also propagate this
> > flag.  But I could be missing some reason why it's not necessary.
> 
> Yes. skb_shift() also moves frag descriptors between skbs, so I think
> SKBFL_SHARED_FRAG should be propagated there as well. Triggering it in
> practice is tricky (it is not deterministic, because it depends on TCP
> write-queue skb merging), but I believe the fix is the right thing to do.
> 
> I'm planning to submit a v2 patch. What do you think?

skb_gro_receive() also appears to need a similar change. Further testing
is in progress.

> 
> 
> Best regards,
> Hyunwoo Kim

