RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: David Laight <David.Laight@ACULAB.COM>
To: 'Linus Torvalds' <torvalds@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>
Subject: RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().
Date: Wed, 23 Aug 2023 08:42:24 +0000	[thread overview]
Message-ID: <10b50bca86004232b28edf0143ce87fc@AcuMS.aculab.com> (raw)
In-Reply-To: <CAHk-=whwEAc22wm8h9FESPB5X+P4bLDgv0erBQMa1buTNQW7tA@mail.gmail.com>

From: Linus Torvalds
> Sent: Tuesday, August 22, 2023 6:36 PM
> 
> On Mon, 21 Aug 2023 at 11:24, Kees Cook <keescook@chromium.org> wrote:
> >
> > It seems like the foot-gun problems are when a value gets clamped by the
> > imposed type. Can't we just warn about those cases?
> 
> I think that the problem with min_t() is that it is used as a random
> "when min() warns about different types", and that it basically just
> adds a cast without any semantic meaning.
> 
> End result: we currently have 4500+ of those cases (and another 1300
> uses of 'max_t') and I bet that several of them are the narrowing
> kind. And some are probably valid.
> 
> And if we tighten up "min_t()" type rules, we'll just hit the *next*
> problem in the series, namely "what are people going to do now?"
> 
> We don't want to just keep pushing the problem down.
> 
> So I actually mostly liked all the *other* patches in David's series:
> using 'min_unsigned()' and friends adds a nice *semantic* layer, not
> just a cast. And relaxing the checking of min/max doesn't cause the
> same the "push problems down" issue, as long as the relaxing is
> reasonable.
> 
> (Side note: I'm not convinced 'min_unsigned()' is the right syntax.
> While I like the concept, I think 'min()' is often used as a part of
> other expressions, and 'min_unsigned()' ends up making for a very
> illegible long and complex thing. I think we might as well use
> 'umin()/umax()', matching our type system).

Picking a name is like painting the bike-shed :-)

> It's just that I very much don't think it's reasonable to relax "20u"
> (or - more commonly - sizeof) to be any random constant signed
> integer, and it should *not* compare well with signed integers and not
> silently become a signed compare.
> 
> (But I do think that it's very ok to go the other way: compare a
> _unsigned_ value with a "signed" constant integer like 20. The two
> cases are not symmetrical: '20' can be a perfectly fine unsigned
> value, but '20u' cannot be treated signed).

I'll rebase the patch after the next -rc1 is out with that removed. 

> And while I don't like David's patch to silently turn unsigned
> constant signed, I do acknowledge that very often the *source* of the
> unsignedness is a 'sizeof()' expression, and then you have an integer
> that gets compared to a size, and you end up using 'min_t()'. But I do
> *NOT* want to fix those cases by ignoring the signedness.
> 
> Just a quick grep of
> 
>     git grep 'min_t(size_t' | wc
> 
> shows that quite a lot of the 'min_t()' cases are this exact issue.
> But I absolutely do *not* think the solution is to relax 'min()'.

size_t is actually a problem with unsigned constants.
It is 'unsigned int' on 32bit and 'unsigned long' on 64bit.
Making it hard to use a literal.

> I suspect the fix to those cases is to much more eagerly use
> 'clamp()'. Almost every single time you do a "compare to a size", it
> really is "make sure the integer is within the size range". So while
> 
>     int val
>    ...
>     x = min(val,sizeof(xyz));
> 
> is horrendously wrong and *should* warn,

The difficulty is getting people to correct it by changing
the type on 'val' to be unsigned.
Sometimes it is just a local variable.
Often it just can't ever be negative, or there is an immediately
preceding check.

Indeed if it is negative that is probably a bug.
(eg a missing error check from an earlier call.)
No amount of compile-time checking is going to help.
About the only thing that would help would be a run-time
check and a 'goto label' if negative.

There are plenty of places where a (short) buffer length is
passed to a function as 'int' - even though negative values
are completely invalid.
In reality the type change needs 'chasing back' through to
all the callers.

> I think doing
> 
>    x = clamp(val, 0, sizeof(xyz));
> 
> is a *much* nicer model, and should not warn even if "val" and the
> upper bound do not agree. In the above kind of situation, suddenly it
> *is* ok to treat the 'sizeof()' as a signed integer, but only because
> we have that explicit lower bound too.

That would require major rework on clamp() :-)

It would also be nice to get clamp(unsigned_var, 0u, 20u) to
compile without the annoying warning from the compiler.
I think you have to use:
	builtin_choose_expr(x, var, 0) >= builtin_choose_expr(x, low, 0)

> In other words: we should not "try to fix the types". That was what
> caused the problem in the first place, with "min_t()" just trying to
> fix the type mismatch with a cast. Casts are wrong, and we should have
> known that. The end result is horrendous, and I do agree with David on
> that too.

Some of the worst casts are the ones that are only there for sparse.
The compiler sees '(force __be32)var' as '(uint)var' rather than 'var'.
That really ought to always have been 'force(__be32, var)' so that
it can be made transparent to the compiler.

> I think we should strive to fix it with "semantic" fixes instead. Like
> the above "use clamp() instead of min(), and the fundamental
> signedness problem simply goes away because it has enough semantic
> meaning to be well-defined".

Unfortunately it adds an extra compare and branch to get mis-predicted.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

next prev parent reply	other threads:[~2023-08-23  8:57 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-04 10:50 [PATCH next v3 0/5] minmax: Relax type checks in min() and max() David Laight
2023-08-04 10:53 ` [PATCH v3 1/5] minmax: Add min_unsigned(a, b) and max_unsigned(a, b) David Laight
2023-08-04 10:54 ` [PATCH v3 2/5] minmax: Allow min()/max()/clamp() if the arguments have the same signedness David Laight
2023-08-04 10:55 ` [PATCH v3 3/5] minmax: Fix indentation of __cmp_once() and __clamp_once() David Laight
2023-08-04 10:55 ` [PATCH v3 4/5] minmax: Allow comparisons of 'int' against 'unsigned char/short' David Laight
2023-08-04 10:56 ` [PATCH v3 5/5] minmax: Relax check to allow comparison between int and small unsigned constants David Laight
2023-08-04 18:14   ` Linus Torvalds
2023-08-07 10:50     ` David Laight
2023-08-07 15:48       ` Linus Torvalds
2023-08-10  8:29         ` David Laight
2023-08-10 19:46           ` Linus Torvalds
2023-08-14  8:04             ` David Laight
2023-08-14 14:51             ` David Laight
2023-08-14 15:29               ` David Laight
2023-08-14 21:21 ` [PATCH next v3 0/5] minmax: Relax type checks in min() and max() Kees Cook
2023-08-15  8:55   ` David Laight
2023-08-21 18:24     ` Kees Cook
2023-08-22 17:35       ` Linus Torvalds
2023-08-23  8:42         ` David Laight [this message]
2023-08-23  8:52       ` David Laight
2023-08-23 15:32         ` Linus Torvalds
2023-08-24  9:05           ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=10b50bca86004232b28edf0143ce87fc@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=Jason@zx2c4.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=hch@infradead.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox