All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Rajat Gupta <rajat.gupta@oss.qualcomm.com>,
	netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, horms@kernel.org, jiri@resnulli.us,
	yimingqian591@gmail.com, keenanat2000@gmail.com,
	2045gemini@gmail.com, rollkingzzc@gmail.com
Subject: Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
Date: Mon, 25 May 2026 10:34:43 -0700	[thread overview]
Message-ID: <20260525103443.1da3e406@kernel.org> (raw)
In-Reply-To: <CAM0EoMmjg+qrc=E+1nSVcCUPFh9yUfw59ASLCOM4jcEvRtw4jg@mail.gmail.com>

On Mon, 25 May 2026 12:22:40 -0400 Jamal Hadi Salim wrote:
> On Mon, May 25, 2026 at 11:39 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > So as an alternative to the piece i posted? i.e this:
> > >
> > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> > > index 79921b8d89ba..8f0f84b50c85 100644
> > > --- a/net/sched/act_pedit.c
> > > +++ b/net/sched/act_pedit.c
> > > @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > >                 if (write_offset < 0) {
> > >                         if (skb_cow(skb, -write_offset))
> > >                                 goto bad;
> > > +                       if (write_offset + (int)sizeof(hdata) > 0) {
> > > +                               if (skb_ensure_writable(skb,
> > > +                                               min_t(int, skb->len,
> > > +                                                     write_offset + (int)sizeof(hdata))))
> > > +                                       goto bad;
> > > +                       }
> > >                 } else {
> > >                         if (unlikely(check_add_overflow(write_offset,
> > >                                                         (int)sizeof(hdata),  
> >
> > Yup! Even better.  
>
> Dude, it's hard to follow you sometimes ;-> It's hard to grok what you
> mean the problem of "we are writing to frags".

Long threads have the tendency of losing focus.
Better to just repost the whole diff than tossing snippets at some point.. ;)

> Let me try to be verbose and you can narrow it down. There are _two_
> codelets dealing with "frags" both of which can be written to.
> 
> 1) The patch from Rajat deals with zero copy from app with shared
> flags. That's whats being exploited in the wild.
> There's a piece of code there that does this in Rajat's patch to handle it:
> 
> +       /*
> +        * If the skb has shared frags the user is likely using zero-copy
> +        * (e.g. sendfile).  Those page frags may point to page-cache pages;
> +        * writing into them would silently corrupt the page cache.
> +        * Linearize so pedit operates on a private copy.
> +        * TL;DR: if you want zero-copy, don't use pedit.
> +        */
> +       if (skb_has_shared_frag(skb)) {
> +               if (__skb_linearize(skb))
> +                       goto bad;
> +       }
> 
> After you posted, i thought this is the "we are writing to frags"
> issue you were referring to.
> I if-zeroed that code and indeed it does not seem that the exploit can
> be executed even with that taken out.

I don't want the skb_has_shared_frag() being added, that's my ask.
TBH I missed that skb_ensure_writable() when reading the patch.

If we use skb_ensure_writable() before all the writes we can delete
the skb_has_shared_frag() (as your test proves). This is the extent
to which I care about the patch, anything else - no strong opinion :)

> 2) There's the frags coming from the other (driver) direction in
> particular when skbs get cloned. That is what sashiko2 (nipa variant)
> pointed out. IOW, there's no active report for that specific one but
> Han Guidong responded after i posted this:
> 
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>                 if (write_offset < 0) {
>                         if (skb_cow(skb, -write_offset))
>                                 goto bad;
> +                       if (write_offset + (int)sizeof(hdata) > 0) {
> +                               if (skb_ensure_writable(skb,
> +                                               min_t(int, skb->len,
> +                                                     write_offset + (int)sizeof(hdata))))
> +                                       goto bad;
> +                       }
>                 } else {
>                         if (unlikely(check_add_overflow(write_offset,
>                                                         (int)sizeof(hdata),
> 
> Saying he was able to recreate that scenario with a kernel module. And
> that this patchlet fixed it.
> 
> Hope you are still following at this point ;->
> So when you said we can use skb pulls - I thought you were referring
> to removing the above patchlet and instead to use an skb pull approach
> (for which you posted a sample patch).

Yes, skb_ensure_writable() does a pull already. So the only problem
with existing patch was that the negative offset branch was missing 
a skb_ensure_writable(). Your snippet added it, plugging that hole, 
so skb_has_shared_frag() can now be 100% safely removed. Hence my
"LGTM".

Please also remove the skb_store_bits(), and skb_header_pointer().
Unless I'm missing something these are dead code.

> I mentioned the two issues:
> 1) It will likely break the negative offset that work with pedit
> already (skb pull could conceivably be tricked to assume a large
> positive number)

Your snippet looked fine tho unnecessarily complex in practice.
(AI generated?). I'd go with skb_ensure_writable(sizeof(*ptr))
as the worst case. Packets shorter than 4B are irrelevant in practice.
But again, up to you.

> 2) that skb clones could result in writting into the shared data
> (which i said i may be overthinking).
> 
> So which one of the two are you referring to? Or maybe it is both.
> Should we keep #1? or this the one that should be replaced?
> Are you ok with patchlet for number #2? Or do you want that replaced?
> 
> Provide as much context as you can so we dont go back and forth ;->

  reply	other threads:[~2026-05-25 17:34 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-18  1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18     ` Toke Høiland-Jørgensen
2026-05-19 15:10     ` Han Guidong
2026-05-20  9:12       ` Jamal Hadi Salim
2026-05-20 10:04         ` Han Guidong
2026-05-20 10:36         ` Han Guidong
2026-05-20 11:40           ` Jamal Hadi Salim
2026-05-20  9:23     ` Jamal Hadi Salim
2026-05-20 20:00       ` Jamal Hadi Salim
2026-05-21  9:53         ` Jamal Hadi Salim
2026-05-21 10:15           ` Jamal Hadi Salim
2026-05-21 14:35             ` Jakub Kicinski
2026-05-21 15:16               ` Jamal Hadi Salim
2026-05-21 15:46                 ` Jakub Kicinski
2026-05-22 11:47                   ` Jamal Hadi Salim
2026-05-22 15:46                     ` Jakub Kicinski
2026-05-22 16:37                       ` Jamal Hadi Salim
2026-05-22 17:01                         ` Jamal Hadi Salim
2026-05-23  0:55                           ` Jakub Kicinski
2026-05-23 12:07                             ` Jamal Hadi Salim
2026-05-23 12:13                               ` Jamal Hadi Salim
2026-05-23 16:46                                 ` Jakub Kicinski
2026-05-23 16:57                                   ` Jamal Hadi Salim
2026-05-25 15:39                                     ` Jakub Kicinski
2026-05-25 16:22                                       ` Jamal Hadi Salim
2026-05-25 17:34                                         ` Jakub Kicinski [this message]
2026-05-25 19:03                                           ` Jamal Hadi Salim
2026-05-26  2:06                                             ` Rajat Gupta
2026-05-26  9:48                                     ` David Laight
2026-05-26 11:57                                       ` Jamal Hadi Salim
2026-05-26 13:08                                         ` David Laight
2026-05-26 14:22                                           ` Jamal Hadi Salim
     [not found]               ` <CAKa-r6soz=iMBiYG0Grhhc12yhdw9vMNV+XjjEPCmtgKK6+rhA@mail.gmail.com>
2026-05-21 15:56                 ` Jakub Kicinski
2026-05-22 11:49                 ` Jamal Hadi Salim
2026-05-22 12:00                   ` Toke Høiland-Jørgensen
2026-05-22 14:49                   ` Davide Caratti
2026-05-22  7:49             ` Han Guidong
2026-05-26  9:53     ` David Laight
2026-05-26 12:01       ` Jamal Hadi Salim
2026-05-26 12:47         ` David Laight
2026-05-26 12:48           ` Jamal Hadi Salim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260525103443.1da3e406@kernel.org \
    --to=kuba@kernel.org \
    --cc=2045gemini@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=keenanat2000@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rajat.gupta@oss.qualcomm.com \
    --cc=rollkingzzc@gmail.com \
    --cc=yimingqian591@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.