Netdev List
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Rajat Gupta <rajat.gupta@oss.qualcomm.com>,
	netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, horms@kernel.org, jiri@resnulli.us,
	yimingqian591@gmail.com, keenanat2000@gmail.com,
	2045gemini@gmail.com, rollkingzzc@gmail.com
Subject: Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
Date: Mon, 25 May 2026 10:34:43 -0700	[thread overview]
Message-ID: <20260525103443.1da3e406@kernel.org> (raw)
In-Reply-To: <CAM0EoMmjg+qrc=E+1nSVcCUPFh9yUfw59ASLCOM4jcEvRtw4jg@mail.gmail.com>

On Mon, 25 May 2026 12:22:40 -0400 Jamal Hadi Salim wrote:
> On Mon, May 25, 2026 at 11:39 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > So as an alternative to the piece i posted? i.e this:
> > >
> > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> > > index 79921b8d89ba..8f0f84b50c85 100644
> > > --- a/net/sched/act_pedit.c
> > > +++ b/net/sched/act_pedit.c
> > > @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > >                 if (write_offset < 0) {
> > >                         if (skb_cow(skb, -write_offset))
> > >                                 goto bad;
> > > +                       if (write_offset + (int)sizeof(hdata) > 0) {
> > > +                               if (skb_ensure_writable(skb,
> > > +                                               min_t(int, skb->len,
> > > +                                                     write_offset + (int)sizeof(hdata))))
> > > +                                       goto bad;
> > > +                       }
> > >                 } else {
> > >                         if (unlikely(check_add_overflow(write_offset,
> > >                                                         (int)sizeof(hdata),  
> >
> > Yup! Even better.  
>
> Dude, it's hard to follow you sometimes ;-> It's hard to grok what you
> mean the problem of "we are writing to frags".

Long threads have the tendency of losing focus.
Better to just repost the whole diff than tossing snippets at some point.. ;)

> Let me try to be verbose and you can narrow it down. There are _two_
> codelets dealing with "frags" both of which can be written to.
> 
> 1) The patch from Rajat deals with zero copy from app with shared
> flags. That's whats being exploited in the wild.
> There's a piece of code there that does this in Rajat's patch to handle it:
> 
> +       /*
> +        * If the skb has shared frags the user is likely using zero-copy
> +        * (e.g. sendfile).  Those page frags may point to page-cache pages;
> +        * writing into them would silently corrupt the page cache.
> +        * Linearize so pedit operates on a private copy.
> +        * TL;DR: if you want zero-copy, don't use pedit.
> +        */
> +       if (skb_has_shared_frag(skb)) {
> +               if (__skb_linearize(skb))
> +                       goto bad;
> +       }
> 
> After you posted, i thought this is the "we are writing to frags"
> issue you were referring to.
> I if-zeroed that code and indeed it does not seem that the exploit can
> be executed even with that taken out.

I don't want the skb_has_shared_frag() being added, that's my ask.
TBH I missed that skb_ensure_writable() when reading the patch.

If we use skb_ensure_writable() before all the writes we can delete
the skb_has_shared_frag() (as your test proves). This is the extent
to which I care about the patch, anything else - no strong opinion :)

> 2) There's the frags coming from the other (driver) direction in
> particular when skbs get cloned. That is what sashiko2 (nipa variant)
> pointed out. IOW, there's no active report for that specific one but
> Han Guidong responded after i posted this:
> 
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -474,6 +474,12 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>                 if (write_offset < 0) {
>                         if (skb_cow(skb, -write_offset))
>                                 goto bad;
> +                       if (write_offset + (int)sizeof(hdata) > 0) {
> +                               if (skb_ensure_writable(skb,
> +                                               min_t(int, skb->len,
> +                                                     write_offset + (int)sizeof(hdata))))
> +                                       goto bad;
> +                       }
>                 } else {
>                         if (unlikely(check_add_overflow(write_offset,
>                                                         (int)sizeof(hdata),
> 
> Saying he was able to recreate that scenario with a kernel module. And
> that this patchlet fixed it.
> 
> Hope you are still following at this point ;->
> So when you said we can use skb pulls - I thought you were referring
> to removing the above patchlet and instead to use an skb pull approach
> (for which you posted a sample patch).

Yes, skb_ensure_writable() does a pull already. So the only problem
with existing patch was that the negative offset branch was missing 
a skb_ensure_writable(). Your snippet added it, plugging that hole, 
so skb_has_shared_frag() can now be 100% safely removed. Hence my
"LGTM".

Please also remove the skb_store_bits(), and skb_header_pointer().
Unless I'm missing something these are dead code.

> I mentioned the two issues:
> 1) It will likely break the negative offset that work with pedit
> already (skb pull could conceivably be tricked to assume a large
> positive number)

Your snippet looked fine tho unnecessarily complex in practice.
(AI generated?). I'd go with skb_ensure_writable(sizeof(*ptr))
as the worst case. Packets shorter than 4B are irrelevant in practice.
But again, up to you.

> 2) that skb clones could result in writting into the shared data
> (which i said i may be overthinking).
> 
> So which one of the two are you referring to? Or maybe it is both.
> Should we keep #1? or this the one that should be replaced?
> Are you ok with patchlet for number #2? Or do you want that replaced?
> 
> Provide as much context as you can so we dont go back and forth ;->

  reply	other threads:[~2026-05-25 17:34 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-18  1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18     ` Toke Høiland-Jørgensen
2026-05-19 15:10     ` Han Guidong
2026-05-20  9:12       ` Jamal Hadi Salim
2026-05-20 10:04         ` Han Guidong
2026-05-20 10:36         ` Han Guidong
2026-05-20 11:40           ` Jamal Hadi Salim
2026-05-20  9:23     ` Jamal Hadi Salim
2026-05-20 20:00       ` Jamal Hadi Salim
2026-05-21  9:53         ` Jamal Hadi Salim
2026-05-21 10:15           ` Jamal Hadi Salim
2026-05-21 14:35             ` Jakub Kicinski
2026-05-21 15:16               ` Jamal Hadi Salim
2026-05-21 15:46                 ` Jakub Kicinski
2026-05-22 11:47                   ` Jamal Hadi Salim
2026-05-22 15:46                     ` Jakub Kicinski
2026-05-22 16:37                       ` Jamal Hadi Salim
2026-05-22 17:01                         ` Jamal Hadi Salim
2026-05-23  0:55                           ` Jakub Kicinski
2026-05-23 12:07                             ` Jamal Hadi Salim
2026-05-23 12:13                               ` Jamal Hadi Salim
2026-05-23 16:46                                 ` Jakub Kicinski
2026-05-23 16:57                                   ` Jamal Hadi Salim
2026-05-25 15:39                                     ` Jakub Kicinski
2026-05-25 16:22                                       ` Jamal Hadi Salim
2026-05-25 17:34                                         ` Jakub Kicinski [this message]
2026-05-25 19:03                                           ` Jamal Hadi Salim
     [not found]               ` <CAKa-r6soz=iMBiYG0Grhhc12yhdw9vMNV+XjjEPCmtgKK6+rhA@mail.gmail.com>
2026-05-21 15:56                 ` Jakub Kicinski
2026-05-22 11:49                 ` Jamal Hadi Salim
2026-05-22 12:00                   ` Toke Høiland-Jørgensen
2026-05-22 14:49                   ` Davide Caratti
2026-05-22  7:49             ` Han Guidong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260525103443.1da3e406@kernel.org \
    --to=kuba@kernel.org \
    --cc=2045gemini@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=keenanat2000@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rajat.gupta@oss.qualcomm.com \
    --cc=rollkingzzc@gmail.com \
    --cc=yimingqian591@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox