Netdev List
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: David Laight <david.laight.linux@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
	jiri@resnulli.us, victor@mojatatu.com, yimingqian591@gmail.com,
	keenanat2000@gmail.com, 2045gemini@gmail.com,
	rollkingzzc@gmail.com, dcaratti@redhat.com, security@kernel.org,
	linux-kernel@vger.kernel.org,
	Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Subject: Re: [PATCH net v2 1/1] net/sched: fix pedit partial COW leading to page cache corruption
Date: Wed, 27 May 2026 12:21:31 +0200
Message-ID: <87tsrt2lw4.fsf@toke.dk>
In-Reply-To: <20260527100055.1661c5c9@pumpkin>

David Laight <david.laight.linux@gmail.com> writes:

> On Tue, 26 May 2026 21:22:52 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> Jamal Hadi Salim <jhs@mojatatu.com> writes:
>> 
>> > From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
>> >
>> > tcf_pedit_act() computes the COW range for skb_ensure_writable()
>> > once before the key loop using tcfp_off_max_hint, but the hint does
>> > not account for the runtime header offset added by typed keys. This
>> > can leave part of the write region un-COW'd.
>> >
>> > Fix by moving skb_ensure_writable() inside the per-key loop where
>> > the actual write offset is known, and add overflow checking on the
>> > offset arithmetic. For negative offsets (e.g. Ethernet header edits
>> > at ingress), use skb_cow() to COW the headroom instead. Guard
>> > offset_valid() against INT_MIN, where negation is undefined.
>> >
>> > Additionally, linearize skbs with shared frags upfront to prevent
>> > silent data corruption when pedit operates on zero-copy pages
>> > (e.g. from sendfile).
>> >
>> > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
>> > Reported-by: Yiming Qian <yimingqian591@gmail.com>
>> > Reported-by: Keenan Dong <keenanat2000@gmail.com>
>> > Reported-by: Han Guidong <2045gemini@gmail.com>
>> > Reported-by: Zhang Cen <rollkingzzc@gmail.com>
>> > Tested-by: Victor Nogueira <victor@mojatatu.com>
>> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>  
>> 
>> Re-ran the tests, and everything looks good, so:
>> 
>> Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> 
>> Also looked at the code, and I have a few nits below, but I'm really
>> nitpicking here, so whether you end up fixing those or not:
>> 
>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> 
>> 
>> [...]
>> 
>> > @@ -323,7 +324,7 @@ static bool offset_valid(struct sk_buff *skb, int offset)
>> >  	if (offset > 0 && offset > skb->len)
>> >  		return false;
>> >  
>> > -	if  (offset < 0 && -offset > skb_headroom(skb))
>> > +	if (offset < 0 && offset < -(int)skb_headroom(skb))
>> >  		return false;  
>> 
>> This change makes it really obvious that this is really just:
>> 
>> 	if (offset < -(int)skb_headroom(skb))
>> 		return false;
>> 
>> so, well, that would be clearer, IMO.
>> 
>> But then I guess the same could be said of the positive case, so:
>> 
>> static bool offset_valid(struct sk_buff *skb, int offset)
>> {
>> 	if (offset > skb->len || offset < -(int)skb_headroom(skb))
>> 		return false;
>> 
>> 	return true;
>> }
>
> There are all sorts of integer conversions going on.
> IIRC both skb->len and skb_headroom() are 32bit unsigned.
> skb_headroom() is relatively small, skb->len can be over 64k but nowhere
> near INT_MAX.

Right, I had the implicit signed/unsigned conversions the wrong way
'round in my head. So what's missing above is a cast of skb->len to int,
i.e.:

 	if (offset > (int)skb->len || offset < -(int)skb_headroom(skb))
 		return false;

right?
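
To illustrate the conversion issue discussed above, here is a minimal user-space sketch. The struct fake_skb type and its headroom field are hypothetical stand-ins for struct sk_buff and skb_headroom(), and both function names are invented for this example; this is not the kernel code. The first function spells out, with an explicit cast, what the usual arithmetic conversions do implicitly when an int is compared against an unsigned int.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two sk_buff properties used here. */
struct fake_skb {
	unsigned int len;      /* plays the role of skb->len */
	unsigned int headroom; /* plays the role of skb_headroom(skb) */
};

/*
 * Equivalent of "offset > skb->len" without the (int) cast: offset is
 * converted to unsigned int, so any negative offset becomes a huge
 * positive value and is rejected by the first comparison.
 */
static bool offset_valid_nocast(const struct fake_skb *skb, int offset)
{
	if ((unsigned int)offset > skb->len || offset < -(int)skb->headroom)
		return false;

	return true;
}

/* With skb->len cast to int, both comparisons are done as signed. */
static bool offset_valid_cast(const struct fake_skb *skb, int offset)
{
	if (offset > (int)skb->len || offset < -(int)skb->headroom)
		return false;

	return true;
}
```

With len = 100 and headroom = 32, an offset of -4 lies inside the headroom: the uncast variant rejects it, while the cast variant accepts it.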

> offset is signed 32bit and the code is allowing for it being -INT_MAX
> (but I'm not at all sure whether that can happen without overflow being likely).
>
> So I think the single test:
> 	if (offset + skb_headroom(skb) >= skb->len + skb_headroom(skb))
> 		return false;
> is correct.
> If offset is 'too negative' the LHS will be 'very large positive' and
> the test fails.

Yeah, I agree. FWIW, the two-test and single-test versions compile to
the same number of instructions; the difference is that one contains a
jump and the other doesn't. It seems clang needs an unlikely() to put
that jump in the 'false' path, though:

https://godbolt.org/z/5o5Gxqe3W

OTOH, I think the two tests are more readable; the single-test version
would need a comment explaining your "too negative" rationale. But given
the subtleties of the implicit integer conversions, maybe that's better
anyway.

And they're both half the instructions of the version currently in the
patch, so if we're code golfing (which I guess we are at this point) we
should probably pick one of those :)

-Toke


