From: Jakub Kicinski <kuba@kernel.org>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Rajat Gupta <rajat.gupta@oss.qualcomm.com>,
netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
pabeni@redhat.com, horms@kernel.org, jiri@resnulli.us,
yimingqian591@gmail.com, keenanat2000@gmail.com,
2045gemini@gmail.com, rollkingzzc@gmail.com
Subject: Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
Date: Mon, 25 May 2026 10:34:43 -0700
Message-ID: <20260525103443.1da3e406@kernel.org>
In-Reply-To: <CAM0EoMmjg+qrc=E+1nSVcCUPFh9yUfw59ASLCOM4jcEvRtw4jg@mail.gmail.com>
On Mon, 25 May 2026 12:22:40 -0400 Jamal Hadi Salim wrote:
> On Mon, May 25, 2026 at 11:39 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > So as an alternative to the piece i posted? i.e this:
> > >
> > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> > > index 79921b8d89ba..8f0f84b50c85 100644
> > > --- a/net/sched/act_pedit.c
> > > +++ b/net/sched/act_pedit.c
> > > @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > >                 if (write_offset < 0) {
> > >                         if (skb_cow(skb, -write_offset))
> > >                                 goto bad;
> > > +                       if (write_offset + (int)sizeof(hdata) > 0) {
> > > +                               if (skb_ensure_writable(skb,
> > > +                                                       min_t(int, skb->len,
> > > +                                                             write_offset + (int)sizeof(hdata))))
> > > +                                       goto bad;
> > > +                       }
> > >                 } else {
> > >                         if (unlikely(check_add_overflow(write_offset,
> > >                                                         (int)sizeof(hdata),
> >
> > Yup! Even better.
>
> Dude, it's hard to follow you sometimes ;-> It's hard to grok what you
> mean by the problem of "we are writing to frags".

Long threads tend to lose focus. At some point it's better to just
repost the whole diff than to keep tossing snippets.. ;)

> Let me try to be verbose and you can narrow it down. There are _two_
> codelets dealing with "frags" both of which can be written to.
>
> 1) The patch from Rajat deals with zero copy from the app with shared
> frags. That's what's being exploited in the wild.
> Rajat's patch handles it with this piece of code:
>
> +       /*
> +        * If the skb has shared frags the user is likely using zero-copy
> +        * (e.g. sendfile). Those page frags may point to page-cache pages;
> +        * writing into them would silently corrupt the page cache.
> +        * Linearize so pedit operates on a private copy.
> +        * TL;DR: if you want zero-copy, don't use pedit.
> +        */
> +       if (skb_has_shared_frag(skb)) {
> +               if (__skb_linearize(skb))
> +                       goto bad;
> +       }
>
> After you posted, i thought this is the "we are writing to frags"
> issue you were referring to.
> I if-zeroed that code and indeed it does not seem that the exploit can
> be executed even with that taken out.

I don't want skb_has_shared_frag() added, that's my ask.
TBH I missed that skb_ensure_writable() when reading the patch.
If we use skb_ensure_writable() before all the writes we can delete
the skb_has_shared_frag() check (as your test proves). This is the
extent to which I care about the patch; anything else, no strong
opinion :)

> 2) There's the frags coming from the other (driver) direction in
> particular when skbs get cloned. That is what sashiko2 (nipa variant)
> pointed out. IOW, there's no active report for that specific one but
> Han Guidong responded after i posted this:
>
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>                 if (write_offset < 0) {
>                         if (skb_cow(skb, -write_offset))
>                                 goto bad;
> +                       if (write_offset + (int)sizeof(hdata) > 0) {
> +                               if (skb_ensure_writable(skb,
> +                                                       min_t(int, skb->len,
> +                                                             write_offset + (int)sizeof(hdata))))
> +                                       goto bad;
> +                       }
>                 } else {
>                         if (unlikely(check_add_overflow(write_offset,
>                                                         (int)sizeof(hdata),
>
> Saying he was able to recreate that scenario with a kernel module. And
> that this patchlet fixed it.
>
> Hope you are still following at this point ;->
> So when you said we can use skb pulls - I thought you were referring
> to removing the above patchlet and using an skb pull approach instead
> (for which you posted a sample patch).

Yes, skb_ensure_writable() does a pull already. So the only problem
with the existing patch was that the negative offset branch was missing
an skb_ensure_writable(). Your snippet added it, plugging that hole,
so skb_has_shared_frag() can now be 100% safely removed. Hence my
"LGTM".

Please also remove the skb_store_bits() and skb_header_pointer() calls.
Unless I'm missing something these are dead code.

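To make the arithmetic of the negative offset branch concrete, here is a
minimal user-space sketch of the length the quoted snippet passes to
skb_ensure_writable(). It is a model only, not kernel code: the helper
name neg_branch_writable_len() is hypothetical, and plain ints stand in
for write_offset, sizeof(hdata) and skb->len.

```c
#include <assert.h>

/*
 * Hypothetical user-space model (not a kernel function).
 *
 * For a negative write_offset, the write covers the byte range
 * [write_offset, write_offset + hdata_size) relative to skb->data.
 * Bytes below 0 live in the headroom, which the skb_cow() call in the
 * quoted snippet deals with. Bytes at 0 and above live in the packet
 * data and are what the snippet asks skb_ensure_writable() to cover.
 *
 * Returns the number of bytes from skb->data that must be writable,
 * or 0 if the write does not reach the packet data at all.
 */
static int neg_branch_writable_len(int write_offset, int hdata_size,
				   int skb_len)
{
	int end;

	/* This model only describes the "write_offset < 0" branch. */
	assert(write_offset < 0);

	end = write_offset + hdata_size;
	if (end <= 0)
		return 0;	/* write ends at or before skb->data */

	/* Same clamp as min_t(int, skb->len, ...) in the snippet. */
	return end < skb_len ? end : skb_len;
}
```

With a 4-byte word, an offset of -2 straddles the boundary and needs the
first 2 bytes of packet data to be writable, while an offset of -4 or
lower stays entirely in the headroom and needs none.
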
> I mentioned the two issues:
> 1) It will likely break the negative offsets that already work with
> pedit (an skb pull could conceivably be tricked into assuming a large
> positive number)

Your snippet looked fine, though unnecessarily complex in practice
(AI generated?). I'd go with skb_ensure_writable(sizeof(*ptr))
as the worst case. Packets shorter than 4B are irrelevant in practice.
But again, up to you.

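The trade-off between the two variants can be sketched in user space as
follows. This is a simplified model under stated assumptions, not kernel
code: all names are hypothetical, the word size is taken to be 4 bytes
(matching the "4B" above), and request_fits() models only the fact that
a request for more bytes than the packet holds cannot be satisfied.

```c
#include <stdbool.h>

enum { WORD_SIZE = 4 };	/* assumed size of the word pedit writes */

/* Clamped variant: length as computed by the quoted snippet. */
static int clamped_len(int write_offset, int skb_len)
{
	int end = write_offset + WORD_SIZE;

	if (end <= 0)
		return 0;	/* write stays in the headroom */
	return end < skb_len ? end : skb_len;
}

/* Worst-case variant: always ask for one full word. */
static int worst_case_len(void)
{
	return WORD_SIZE;
}

/* Simplified model: a request longer than the packet cannot succeed. */
static bool request_fits(int write_len, int skb_len)
{
	return write_len <= skb_len;
}
```

For any negative offset the worst-case length is at least the clamped
one, so it covers the same bytes with less arithmetic. The only
behavioural difference in this model is for packets shorter than one
word, where the worst-case request no longer fits.
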
> 2) that skb clones could result in writing into the shared data
> (which i said i may be overthinking).
>
> So which one of the two are you referring to? Or maybe it is both.
> Should we keep #1? Or is this the one that should be replaced?
> Are you ok with the patchlet for #2? Or do you want that replaced?
>
> Provide as much context as you can so we don't go back and forth ;->

Thread overview (35+ messages):
2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18 ` Toke Høiland-Jørgensen
2026-05-19 15:10 ` Han Guidong
2026-05-20 9:12 ` Jamal Hadi Salim
2026-05-20 10:04 ` Han Guidong
2026-05-20 10:36 ` Han Guidong
2026-05-20 11:40 ` Jamal Hadi Salim
2026-05-20 9:23 ` Jamal Hadi Salim
2026-05-20 20:00 ` Jamal Hadi Salim
2026-05-21 9:53 ` Jamal Hadi Salim
2026-05-21 10:15 ` Jamal Hadi Salim
2026-05-21 14:35 ` Jakub Kicinski
2026-05-21 15:16 ` Jamal Hadi Salim
2026-05-21 15:46 ` Jakub Kicinski
2026-05-22 11:47 ` Jamal Hadi Salim
2026-05-22 15:46 ` Jakub Kicinski
2026-05-22 16:37 ` Jamal Hadi Salim
2026-05-22 17:01 ` Jamal Hadi Salim
2026-05-23 0:55 ` Jakub Kicinski
2026-05-23 12:07 ` Jamal Hadi Salim
2026-05-23 12:13 ` Jamal Hadi Salim
2026-05-23 16:46 ` Jakub Kicinski
2026-05-23 16:57 ` Jamal Hadi Salim
2026-05-25 15:39 ` Jakub Kicinski
2026-05-25 16:22 ` Jamal Hadi Salim
2026-05-25 17:34 ` Jakub Kicinski [this message]
2026-05-25 19:03 ` Jamal Hadi Salim
[not found] ` <CAKa-r6soz=iMBiYG0Grhhc12yhdw9vMNV+XjjEPCmtgKK6+rhA@mail.gmail.com>
2026-05-21 15:56 ` Jakub Kicinski
2026-05-22 11:49 ` Jamal Hadi Salim
2026-05-22 12:00 ` Toke Høiland-Jørgensen
2026-05-22 14:49 ` Davide Caratti
2026-05-22 7:49 ` Han Guidong