Re: [PATCH] RDMA/ocrdma: Fix an off-by-one issue in 'ocrdma_add_stat'

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jason Gunthorpe <jgg@ziepe.ca>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: g@ziepe.ca, Christophe JAILLET <christophe.jaillet@wanadoo.fr>,
	selvin.xavier@broadcom.com, devesh.sharma@broadcom.com,
	dledford@redhat.com, leon@kernel.org, colin.king@canonical.com,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-janitors@vger.kernel.org
Subject: Re: [PATCH] RDMA/ocrdma: Fix an off-by-one issue in 'ocrdma_add_stat'
Date: Fri, 17 Apr 2020 15:56:18 +0000	[thread overview]
Message-ID: <20200417155618.GG26002@ziepe.ca> (raw)
In-Reply-To: <20200417151314.GV1163@kadam>

On Fri, Apr 17, 2020 at 06:13:14PM +0300, Dan Carpenter wrote:
> I was just meant unsigned iterators, not sizes.  I consider that to be a
> different sort of bug.  The original code did this:
> 
> 	desc_size = max_t(int, 32, desc_size);
> 
> Using signed casts for min_t() always seems like a crazy thing to me.  I
> have a static checker warning for those but I think people didn't accept
> my patches for those if it was only for kernel hardenning and
> readability instead of to fix bugs.  I don't know why, maybe casting to
> an int is faster?

Casting usually doesn't cost anything... But I think this shows again
why int is trouble: most likely desc_size is a unsigned value stored
in an int, and the temptation of max_t is to use the type of the
output variable.  So 'int' is a logical, if not bonkers choice.

If desc_size was properly unsigned then the author should keep using
unsigned through the max_t()

> > > You would need to hit a series of fairly rare events for this
> > > warning to be useful and I have never seen that happen yet.
> > 
> > IIRC the case was the uapi rightly used u32, which was then wrongly
> > implicitly cast to some internal function,  accepting int, which then
> > did something sort of like
> > 
> >   int len
> >   if (len >= sizeof(a))
> >        return -EINVAL
> >   copy_from_user(a, b, len)
> 
> This code works.  "len" is type promoted to unsigned and negative values
> are rejected.

Expecting people to know the unsigned/sign implicit promotion rules
for expressions to determine the code is right/wrong is just asking to
much, IMHO. I certainly don't have them all memorized and try to avoid
them :(

Using int pretty much guarentees you hit those cases...

> The real life example was slightly more complicated than that.  But the
> point is that a lot of people think unsigned values are inherently more
> safe and they use u32 everywhere as a default datatype.  I argue that
> the default should always be int unless there is a good reason
> otherwise.

Why? In my experience most values actually are never negative.

> to save memory.  There are reasons for the other datatypes to exist, but
> using them is tricky so it's best to avoid it if you can.

I can't say I have the same experience..

> There is a lot of magic to making your limits unsigned long type.

Oh? More magic than int is implictly promoted to unsigned anyhow?

> > > Originally if user_value was an int then the loop would have been a
> > > harmless no-op but now it was a large positive value so it lead to
> > > memory corruption.  Another example is:
> > > 
> > > 	for (i = 0; i < user_value - 1; i++) {
> > 
> > Again, code like this is simply missing required input validation. The
> > for loop works with int by dumb luck, and this would be broken if it
> > called copy_from_user.
> 
> The thing about int type is that it works like people expect normal
> numbers to work.

Not really. Some cases are a bit better, some are worse. As above when
using int:

 -1 >= sizeof(A) = true

Which is not at all how any sane person thinks about normal
numbers. There are lots of these odd traps around implicit promotion.

While foo-1 is right there, explicitly. If foo-1 < 0 and the code is
not expecting it then you have a clear problem.

> People normally think that zero minus one is going to
> be negative but if they change to u32 by default then it wraps to
> UINT_MAX and that's unexpected.  

Maybe I've been doing this too long, but this seems totally expected
to me...

> There is an element where the static checker encourages people to
> "change your types to match" and that's garbage advice.  Changing
> your types doesn't magically make things better and I would argue
> that it normally makes things worse.

I don't know much about this warning, but I do find that people
starting out tend to just use 'int' everywhere and 'hope for the best'
without any clear understanding or thinking of what types are what.

> > If you get the in habit of using types properly then it is less likely
> > this bug-class will happen. If your habit is to just always use 'int'
> > for everything then you *will* accidently cause a user value to be
> > implicitly casted.
> 
> This is an interesting theory but I haven't seen any evidence to support
> it.  My intuition is that it's better to only care when you have to
> otherwise you get overwhelmed.

If you make everything unsigned, there really is no downside, other
than possible subtraction related issues that exist anyhow, AFAIK.

This is the approach the C std uses, pretty much the entire API is
properly marked with signed/unsigned. I feel in pretty good company
advocating that this is the best way to write C code :)

But then again, I find the modular math intuitive and aware to be
careful with subtraction...

Jason

WARNING: multiple messages have this Message-ID (diff)

From: Jason Gunthorpe <jgg@ziepe.ca>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: g@ziepe.ca, Christophe JAILLET <christophe.jaillet@wanadoo.fr>,
	selvin.xavier@broadcom.com, devesh.sharma@broadcom.com,
	dledford@redhat.com, leon@kernel.org, colin.king@canonical.com,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-janitors@vger.kernel.org
Subject: Re: [PATCH] RDMA/ocrdma: Fix an off-by-one issue in 'ocrdma_add_stat'
Date: Fri, 17 Apr 2020 12:56:18 -0300	[thread overview]
Message-ID: <20200417155618.GG26002@ziepe.ca> (raw)
In-Reply-To: <20200417151314.GV1163@kadam>

On Fri, Apr 17, 2020 at 06:13:14PM +0300, Dan Carpenter wrote:
> I was just meant unsigned iterators, not sizes.  I consider that to be a
> different sort of bug.  The original code did this:
> 
> 	desc_size = max_t(int, 32, desc_size);
> 
> Using signed casts for min_t() always seems like a crazy thing to me.  I
> have a static checker warning for those but I think people didn't accept
> my patches for those if it was only for kernel hardenning and
> readability instead of to fix bugs.  I don't know why, maybe casting to
> an int is faster?

Casting usually doesn't cost anything... But I think this shows again
why int is trouble: most likely desc_size is a unsigned value stored
in an int, and the temptation of max_t is to use the type of the
output variable.  So 'int' is a logical, if not bonkers choice.

If desc_size was properly unsigned then the author should keep using
unsigned through the max_t()

> > > You would need to hit a series of fairly rare events for this
> > > warning to be useful and I have never seen that happen yet.
> > 
> > IIRC the case was the uapi rightly used u32, which was then wrongly
> > implicitly cast to some internal function,  accepting int, which then
> > did something sort of like
> > 
> >   int len
> >   if (len >= sizeof(a))
> >        return -EINVAL
> >   copy_from_user(a, b, len)
> 
> This code works.  "len" is type promoted to unsigned and negative values
> are rejected.

Expecting people to know the unsigned/sign implicit promotion rules
for expressions to determine the code is right/wrong is just asking to
much, IMHO. I certainly don't have them all memorized and try to avoid
them :(

Using int pretty much guarentees you hit those cases...

> The real life example was slightly more complicated than that.  But the
> point is that a lot of people think unsigned values are inherently more
> safe and they use u32 everywhere as a default datatype.  I argue that
> the default should always be int unless there is a good reason
> otherwise.

Why? In my experience most values actually are never negative.

> to save memory.  There are reasons for the other datatypes to exist, but
> using them is tricky so it's best to avoid it if you can.

I can't say I have the same experience..

> There is a lot of magic to making your limits unsigned long type.

Oh? More magic than int is implictly promoted to unsigned anyhow?

> > > Originally if user_value was an int then the loop would have been a
> > > harmless no-op but now it was a large positive value so it lead to
> > > memory corruption.  Another example is:
> > > 
> > > 	for (i = 0; i < user_value - 1; i++) {
> > 
> > Again, code like this is simply missing required input validation. The
> > for loop works with int by dumb luck, and this would be broken if it
> > called copy_from_user.
> 
> The thing about int type is that it works like people expect normal
> numbers to work.

Not really. Some cases are a bit better, some are worse. As above when
using int:

 -1 >= sizeof(A) == true

Which is not at all how any sane person thinks about normal
numbers. There are lots of these odd traps around implicit promotion.

While foo-1 is right there, explicitly. If foo-1 < 0 and the code is
not expecting it then you have a clear problem.

> People normally think that zero minus one is going to
> be negative but if they change to u32 by default then it wraps to
> UINT_MAX and that's unexpected.  

Maybe I've been doing this too long, but this seems totally expected
to me...

> There is an element where the static checker encourages people to
> "change your types to match" and that's garbage advice.  Changing
> your types doesn't magically make things better and I would argue
> that it normally makes things worse.

I don't know much about this warning, but I do find that people
starting out tend to just use 'int' everywhere and 'hope for the best'
without any clear understanding or thinking of what types are what.

> > If you get the in habit of using types properly then it is less likely
> > this bug-class will happen. If your habit is to just always use 'int'
> > for everything then you *will* accidently cause a user value to be
> > implicitly casted.
> 
> This is an interesting theory but I haven't seen any evidence to support
> it.  My intuition is that it's better to only care when you have to
> otherwise you get overwhelmed.

If you make everything unsigned, there really is no downside, other
than possible subtraction related issues that exist anyhow, AFAIK.

This is the approach the C std uses, pretty much the entire API is
properly marked with signed/unsigned. I feel in pretty good company
advocating that this is the best way to write C code :)

But then again, I find the modular math intuitive and aware to be
careful with subtraction...

Jason

next prev parent reply	other threads:[~2020-04-17 15:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-28  7:30 [PATCH] RDMA/ocrdma: Fix an off-by-one issue in 'ocrdma_add_stat' Christophe JAILLET
2020-03-28  7:30 ` Christophe JAILLET
2020-04-14 18:34 ` Jason Gunthorpe
2020-04-14 18:34   ` Jason Gunthorpe
2020-04-16 13:08   ` Dan Carpenter
2020-04-16 13:08     ` Dan Carpenter
2020-04-16 18:47     ` Jason Gunthorpe
2020-04-16 18:47       ` Jason Gunthorpe
2020-04-17 11:26       ` Dan Carpenter
2020-04-17 11:26         ` Dan Carpenter
2020-04-17 12:25         ` Jason Gunthorpe
2020-04-17 12:25           ` Jason Gunthorpe
2020-04-17 13:09           ` Dan Carpenter
2020-04-17 13:09             ` Dan Carpenter
2020-04-17 13:48             ` Jason Gunthorpe
2020-04-17 13:48               ` Jason Gunthorpe
2020-04-17 15:13               ` Dan Carpenter
2020-04-17 15:13                 ` Dan Carpenter
2020-04-17 15:56                 ` Jason Gunthorpe [this message]
2020-04-17 15:56                   ` Jason Gunthorpe
2020-04-17 13:28   ` Marion & Christophe JAILLET
2020-04-17 13:28     ` Marion & Christophe JAILLET
2020-04-17 13:50     ` Jason Gunthorpe
2020-04-17 13:50       ` Jason Gunthorpe
2020-04-17 14:39       ` Christophe JAILLET
2020-04-17 14:39         ` Christophe JAILLET

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200417155618.GG26002@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=christophe.jaillet@wanadoo.fr \
    --cc=colin.king@canonical.com \
    --cc=dan.carpenter@oracle.com \
    --cc=devesh.sharma@broadcom.com \
    --cc=dledford@redhat.com \
    --cc=g@ziepe.ca \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=selvin.xavier@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.