* [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
@ 2026-05-21 12:16 lazyming
2026-05-23 8:58 ` lazyming
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: lazyming @ 2026-05-21 12:16 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable
pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
the old skb_shared_info header into a new buffer via memcpy(), which
includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
Neither function calls net_zcopy_get() for the new shinfo, creating an
unaccounted holder: every skb_shared_info with destructor_arg set will
call skb_zcopy_clear() once when freed, but the corresponding
net_zcopy_get() was never called for the new copy. Repeated calls
drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
TX skbs still hold live destructor_arg pointers.
KASAN reports use-after-free on a freed ubuf_info_msgzc:
BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810
Read of size 8 at addr ffff88801574d3e8 by task poc/220
Call Trace:
skb_release_data+0x77b/0x810
kfree_skb_list_reason+0x13e/0x610
skb_release_data+0x4cd/0x810
sk_skb_reason_drop+0xf3/0x340
skb_queue_purge_reason+0x282/0x440
rds_tcp_inc_free+0x1e/0x30
rds_recvmsg+0x354/0x1780
__sys_recvmsg+0xdf/0x180
Allocated by task 219:
msg_zerocopy_realloc+0x157/0x7b0
tcp_sendmsg_locked+0x2892/0x3ba0
Freed by task 219:
ip_recv_error+0x74a/0xb10
tcp_recvmsg+0x475/0x530
The skb consuming the late access still referenced the same uarg via
shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without
a refcount bump. This has been verified to be reliably exploitable: a
working proof-of-concept achieves full root privilege escalation from
an unprivileged local user on a default kernel configuration.
The fix follows the pattern of pskb_expand_head() which has the same
memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get()
is placed after skb_orphan_frags() succeeds, so the orphan error path
needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is
placed after all failure points and just before skb_release_data(), so
no error path needs cleanup at all -- matching pskb_expand_head() more
closely and avoiding the need for a balancing net_zcopy_put().
Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: lazyming <minhnguyen.080505@gmail.com>
---
net/core/skbuff.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 44ac121cf..6a1a2c203 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6810,6 +6810,8 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
skb_kfree_head(data);
return -ENOMEM;
}
+ if (skb_zcopy(skb))
+ net_zcopy_get(skb_zcopy(skb));
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
skb_frag_ref(skb, i);
if (skb_has_frag_list(skb))
@@ -6953,6 +6955,8 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off,
skb_kfree_head(data);
return -ENOMEM;
}
+ if (skb_zcopy(skb))
+ net_zcopy_get(skb_zcopy(skb));
skb_release_data(skb, SKB_CONSUMED);
skb->head = data;
base-commit: 94e3dd6874bf04d5939bc8431b9f7852f3a4a121
--
2.54.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-21 12:16 [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers lazyming
@ 2026-05-23 8:58 ` lazyming
2026-05-24 13:37 ` Willem de Bruijn
2026-05-25 19:54 ` Jakub Kicinski
2 siblings, 0 replies; 8+ messages in thread
From: lazyming @ 2026-05-23 8:58 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
stable
I reviewed the Sashiko AI review of this patch and it raised two
issues worth noting:
1. Stale shinfo after skb_orphan_frags (vhost_net path) -- this can
lead to UAF when skb_zcopy_clear() fires on a destructor_arg that
was already freed by skb_copy_ubufs().
2. SKBFL_MANAGED_FRAG_REFS page ref leak (io_uring path) -- page refs
from skb_frag_ref() are never released because skb_release_data()
skips __skb_frag_unref() when MANAGED_FRAG_REFS is set.
Both are pre-existing bugs in the carve helpers, not introduced by
this patch. This patch only fixes the missing net_zcopy_get() for the
MSG_ZEROCOPY TCP path (SKBFL_ZEROCOPY_ENABLE without SKBFL_SHARED_FRAG),
which is unrelated to either issue above.
Could you re-review this patch? Issue 1 in particular looks genuinely
dangerous and probably deserves a separate fix from someone familiar
with the vhost_net zcopy path.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-21 12:16 [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers lazyming
2026-05-23 8:58 ` lazyming
@ 2026-05-24 13:37 ` Willem de Bruijn
2026-05-24 14:06 ` Willem de Bruijn
2026-05-25 19:54 ` Jakub Kicinski
2 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2026-05-24 13:37 UTC (permalink / raw)
To: lazyming, netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable
lazyming wrote:
> pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
> the old skb_shared_info header into a new buffer via memcpy(), which
> includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
These functions are not supposed to maintain zerocopy frags.
Both call skb_orphan_frags.
I think what may need to happen is to invert the order of that call
and the memcpy. Current code:
memcpy((struct skb_shared_info *)(data + size),
skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
if (skb_orphan_frags(skb, gfp_mask)) {
skb_kfree_head(data);
return -ENOMEM;
}
> Neither function calls net_zcopy_get() for the new shinfo, creating an
> unaccounted holder: every skb_shared_info with destructor_arg set will
> call skb_zcopy_clear() once when freed, but the corresponding
> net_zcopy_get() was never called for the new copy. Repeated calls
> drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
> TX skbs still hold live destructor_arg pointers.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-24 13:37 ` Willem de Bruijn
@ 2026-05-24 14:06 ` Willem de Bruijn
2026-05-25 15:18 ` Willem de Bruijn
0 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2026-05-24 14:06 UTC (permalink / raw)
To: Willem de Bruijn, lazyming, netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable, asml.silence, achender
Willem de Bruijn wrote:
> lazyming wrote:
> > pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
> > the old skb_shared_info header into a new buffer via memcpy(), which
> > includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
>
> These functions are not supposed to maintain zerocopy frags.
>
> Both call skb_orphan_frags.
>
> I think what may need to happen is to invert the order of that call
> and the memcpy. Current code:
>
> memcpy((struct skb_shared_info *)(data + size),
> skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
> if (skb_orphan_frags(skb, gfp_mask)) {
> skb_kfree_head(data);
> return -ENOMEM;
> }
Never mind. This actually corresponds to the first Sashiko report you
mentioned: if zerocopy skbs are converted, then the memcpy prior to
that call will have stale state.
For skbs where skb_orphan_frags does not do a deep copy, we do need to
take this extra reference.
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-24 14:06 ` Willem de Bruijn
@ 2026-05-25 15:18 ` Willem de Bruijn
2026-05-25 15:31 ` Willem de Bruijn
0 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2026-05-25 15:18 UTC (permalink / raw)
To: Willem de Bruijn, Willem de Bruijn, lazyming, netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable, asml.silence, achender, mst, jasowang
Willem de Bruijn wrote:
> Willem de Bruijn wrote:
> > lazyming wrote:
> > > pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
> > > the old skb_shared_info header into a new buffer via memcpy(), which
> > > includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
> >
> > These functions are not supposed to maintain zerocopy frags.
> >
> > Both call skb_orphan_frags.
> >
> > I think what may need to happen is to invert the order of that call
> > and the memcpy. Current code:
> >
> > memcpy((struct skb_shared_info *)(data + size),
> > skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
> > if (skb_orphan_frags(skb, gfp_mask)) {
> > skb_kfree_head(data);
> > return -ENOMEM;
> > }
>
> Never mind. This actually corresponds to the first Sashiko report you
> mentioned: if zerocopy skbs are converted, then the memcpy prior to
> that call will have stale state.
>
> For skbs where skb_orphan_frags does not do a deep copy, we do need to
> take this extra reference.
>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
Not sure the potential preexisting issue is reachable.
Vhost-net and other zerocopy that predates MSG_ZEROCOPY does not
refcount ubuf_info. Instead it calls skb_copy_ubufs on skb_clone.
So if such an skb reaches pskb_expand_head, it should be guaranteed to
not be a clone. Same for the carve methods added later.
But, the commit that added zerocopy, commit a6686f2f382b
("skbuff: skb supports zero-copy buffers"), included this
pksb_expand_head call to skb_copy_ubufs from the start. That implies
that was expected to be reachable. I just don't see how yet.
If it is reachable, then all that is needed is to clear shinfo->flags.
Or more neatly,
skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-25 15:18 ` Willem de Bruijn
@ 2026-05-25 15:31 ` Willem de Bruijn
2026-05-25 17:31 ` Willem de Bruijn
0 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2026-05-25 15:31 UTC (permalink / raw)
To: Willem de Bruijn, Willem de Bruijn, Willem de Bruijn, lazyming,
netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable, asml.silence, achender, mst, jasowang
Willem de Bruijn wrote:
> Willem de Bruijn wrote:
> > Willem de Bruijn wrote:
> > > lazyming wrote:
> > > > pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
> > > > the old skb_shared_info header into a new buffer via memcpy(), which
> > > > includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
> > >
> > > These functions are not supposed to maintain zerocopy frags.
> > >
> > > Both call skb_orphan_frags.
> > >
> > > I think what may need to happen is to invert the order of that call
> > > and the memcpy. Current code:
> > >
> > > memcpy((struct skb_shared_info *)(data + size),
> > > skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
> > > if (skb_orphan_frags(skb, gfp_mask)) {
> > > skb_kfree_head(data);
> > > return -ENOMEM;
> > > }
> >
> > Never mind. This actually corresponds to the first Sashiko report you
> > mentioned: if zerocopy skbs are converted, then the memcpy prior to
> > that call will have stale state.
> >
> > For skbs where skb_orphan_frags does not do a deep copy, we do need to
> > take this extra reference.
> >
> > Reviewed-by: Willem de Bruijn <willemb@google.com>
>
> Not sure the potential preexisting issue is reachable.
>
> Vhost-net and other zerocopy that predates MSG_ZEROCOPY does not
> refcount ubuf_info. Instead it calls skb_copy_ubufs on skb_clone.
>
> So if such an skb reaches pskb_expand_head, it should be guaranteed to
> not be a clone. Same for the carve methods added later.
>
> But, the commit that added zerocopy, commit a6686f2f382b
> ("skbuff: skb supports zero-copy buffers"), included this
> pksb_expand_head call to skb_copy_ubufs from the start. That implies
> that was expected to be reachable. I just don't see how yet.
>
> If it is reachable, then all that is needed is to clear shinfo->flags.
> Or more neatly,
>
> skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
Also, I'm not the expert on more recent managed frags
(SKBFL_MANAGED_FRAG_REFS).
That calls skb_zcopy_downgrade_managed in pskb_expand_head, but not in
the two other functions with memcpy before skb_copy_ubufs:
pskb_carve_inside_header and pskb_carve_inside_nonlinear.
I assume because those shorten the skb, so no risk of getting mixed
mode refcounted and non-refcounted frags?
In general zerocopy can be split in refcounted and non-refcounted.
Refcounted zerocopy will not downgrade in these cases, so will not
modify shinfo->flags after memcpy.
Non-refcounted should always get converted to copy in skb_clone,
so will not enter the skb_cloned() branch here.
If in doubt maybe warrants a rare WARN_ON_ONCE patch.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-25 15:31 ` Willem de Bruijn
@ 2026-05-25 17:31 ` Willem de Bruijn
0 siblings, 0 replies; 8+ messages in thread
From: Willem de Bruijn @ 2026-05-25 17:31 UTC (permalink / raw)
To: Willem de Bruijn, Willem de Bruijn, Willem de Bruijn,
Willem de Bruijn, lazyming, netdev
Cc: davem, edumazet, kuba, pabeni, horms, w, security, linux-kernel,
lazyming, stable, asml.silence, achender, mst, jasowang
Willem de Bruijn wrote:
> Willem de Bruijn wrote:
> > Willem de Bruijn wrote:
> > > Willem de Bruijn wrote:
> > > > lazyming wrote:
> > > > > pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
> > > > > the old skb_shared_info header into a new buffer via memcpy(), which
> > > > > includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
> > > >
> > > > These functions are not supposed to maintain zerocopy frags.
> > > >
> > > > Both call skb_orphan_frags.
> > > >
> > > > I think what may need to happen is to invert the order of that call
> > > > and the memcpy. Current code:
> > > >
> > > > memcpy((struct skb_shared_info *)(data + size),
> > > > skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
> > > > if (skb_orphan_frags(skb, gfp_mask)) {
> > > > skb_kfree_head(data);
> > > > return -ENOMEM;
> > > > }
> > >
> > > Never mind. This actually corresponds to the first Sashiko report you
> > > mentioned: if zerocopy skbs are converted, then the memcpy prior to
> > > that call will have stale state.
> > >
> > > For skbs where skb_orphan_frags does not do a deep copy, we do need to
> > > take this extra reference.
> > >
> > > Reviewed-by: Willem de Bruijn <willemb@google.com>
> >
> > Not sure the potential preexisting issue is reachable.
> >
> > Vhost-net and other zerocopy that predates MSG_ZEROCOPY does not
> > refcount ubuf_info. Instead it calls skb_copy_ubufs on skb_clone.
> >
> > So if such an skb reaches pskb_expand_head, it should be guaranteed to
> > not be a clone. Same for the carve methods added later.
> >
> > But, the commit that added zerocopy, commit a6686f2f382b
> > ("skbuff: skb supports zero-copy buffers"), included this
> > pksb_expand_head call to skb_copy_ubufs from the start. That implies
> > that was expected to be reachable. I just don't see how yet.
> >
> > If it is reachable, then all that is needed is to clear shinfo->flags.
> > Or more neatly,
> >
> > skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>
> Also, I'm not the expert on more recent managed frags
> (SKBFL_MANAGED_FRAG_REFS).
>
> That calls skb_zcopy_downgrade_managed in pskb_expand_head, but not in
> the two other functions with memcpy before skb_copy_ubufs:
> pskb_carve_inside_header and pskb_carve_inside_nonlinear.
>
> I assume because those shorten the skb, so no risk of getting mixed
> mode refcounted and non-refcounted frags?
>
> In general zerocopy can be split in refcounted and non-refcounted.
>
> Refcounted zerocopy will not downgrade in these cases, so will not
> modify shinfo->flags after memcpy.
>
> Non-refcounted should always get converted to copy in skb_clone,
> so will not enter the skb_cloned() branch here.
>
> If in doubt maybe warrants a rare WARN_ON_ONCE patch.
I was unable to find a path also with the help of Gemini.
It did spot an interesting case where a cloned unrefcounted zerocopy
skb can be created. But it is reduced to uncloned immediately.
skb_morph is like skb_clone (calls __skb_clone), but skips the
skb_orphan_frags check. Its only caller ensures that this is safe.
But I'm inclined to send a patch to net-next that makes skb_morph
itself safe, by orphaning as well (and updating the caller to
gracefully handle skb_morph failure).
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
2026-05-21 12:16 [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers lazyming
2026-05-23 8:58 ` lazyming
2026-05-24 13:37 ` Willem de Bruijn
@ 2026-05-25 19:54 ` Jakub Kicinski
2 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2026-05-25 19:54 UTC (permalink / raw)
To: lazyming
Cc: netdev, davem, edumazet, pabeni, horms, w, security, linux-kernel,
stable
On Thu, 21 May 2026 19:16:28 +0700 lazyming wrote:
> From: lazyming <minhnguyen.080505@gmail.com>
Your email address seems to indicate that your real name is not "lazy.."
Please repost with the From / Author and Signed-off-by lines containing
your real (some approximation of "legal") name. Unicode characters are
accepted.
--
pw-bot: cr
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-05-25 19:54 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-21 12:16 [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers lazyming
2026-05-23 8:58 ` lazyming
2026-05-24 13:37 ` Willem de Bruijn
2026-05-24 14:06 ` Willem de Bruijn
2026-05-25 15:18 ` Willem de Bruijn
2026-05-25 15:31 ` Willem de Bruijn
2026-05-25 17:31 ` Willem de Bruijn
2026-05-25 19:54 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox