linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
       [not found] <E1J3RD5-0006yU-00@gondolin.me.apana.org.au>
@ 2007-12-17 17:28 ` Linus Torvalds
  2007-12-17 17:48   ` Al Viro
  2007-12-17 17:55   ` Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Linus Torvalds @ 2007-12-17 17:28 UTC (permalink / raw)
  To: Herbert Xu
  Cc: John Reiser, Andrew Morton, security, tytso,
	Linux Kernel Mailing List, mpm, linux-sparse



On Sat, 15 Dec 2007, Herbert Xu wrote:
> 
> There ought to be a warning about this sort of thing.

We could add it to sparse. The appended (untested) patch seems to say 
there's a lot of those signed divides-by-power-of-twos.

However, the problem with such warnings is that it encourages people to do 
the simple fix that may be *wrong*. For example, you fixed it with patches 
like

> -		int rsvd = r->limit ? 0 : random_read_wakeup_thresh/4;
> +		int rsvd = r->limit ? 0 : random_read_wakeup_thresh / 4u;

which is really quite dangerous for several reasons:

 - it depends intimately on the type of the thing being divided (try it: 
   it will do nothing at all if the thing you divide is larger than 
   "unsigned int", since then the "4u" will be turned into a _signed_ 
   larger type by the C type expansion).

   So in general, the above doesn't even do what it's supposed to do on a 
   64-bit architecture if the thing to be divided is 64-bit!

 - it changes behaviour. If that thing really is signed and can be 
   negative, that "trivial" patch just changed the divide to be 
   fundamentally something totally different.

so I think this patch is horribly wrong.

The *correct* way to fix signed divisions is by doing one of two things:

 - really make the data we divide be unsigned. With all the thinking that 
   involves!

   This is the good change, but it does involve making sure that there are 
   no tests against zero and that the value really cannot go negative. 
   Usually the unsigned types are (a) faster and (b) more robust, but if 
   somebody is depending on signs, unsigned types are obviously not 
   appropriate.

 - change a divide-by-power-of-2 into a signed shift instead.

   Yes, this also changes the end result for negative values, but it 
   changes it in a sane and generally good way (ie it will still be a 
   "valid" divide, it will just be a divide that rounds differently, and 
   is more likely than turning it into an unsigned divide to generally 
   result in working code).

Hmm?

			Linus
---
 simplify.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/simplify.c b/simplify.c
index 94e14d2..91f1120 100644
--- a/simplify.c
+++ b/simplify.c
@@ -286,6 +286,36 @@ static int simplify_constant_rightside(struct instruction *insn)
 		if (!value)
 			return replace_with_pseudo(insn, insn->src2);
 		return 0;
+
+	case OP_DIVU: case OP_DIVS:
+		if (!value)
+			break;
+		if (value == 1)
+			return replace_with_pseudo(insn, insn->src1);
+		/* Power-of-two? */
+		if (!(value & (value-1))) {
+			int log2 = -1;
+			do {
+				log2++;
+				value >>= 1;
+			} while (value);
+
+			/* Turn unsigned divides into shifts */
+			if (insn->opcode == OP_DIVU) {
+				insn->src2->value = log2;
+				insn->opcode = OP_LSR;
+				return 1;
+			}
+
+			/*
+			 * This is incorrect, but we care more about
+			 * the warning than the code generation
+			 */
+			warning(insn->pos, "expensive signed divide");
+			insn->src2->value = log2;
+			insn->opcode = OP_ASR;
+			return 1;
+		}
 	}
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 17:28 ` Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data) Linus Torvalds
@ 2007-12-17 17:48   ` Al Viro
  2007-12-17 17:55   ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: Al Viro @ 2007-12-17 17:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Herbert Xu, John Reiser, Andrew Morton, security, tytso,
	Linux Kernel Mailing List, mpm, linux-sparse

On Mon, Dec 17, 2007 at 09:28:57AM -0800, Linus Torvalds wrote:
> 
> 
> On Sat, 15 Dec 2007, Herbert Xu wrote:
> > 
> > There ought to be a warning about this sort of thing.
> 
> We could add it to sparse. The appended (untested) patch seems to say 
> there's a lot of those signed divides-by-power-of-twos.

I'm not sure that you are warning about the right things.  If you want
a real nightmare scenario in that area, consider this:

	int x[20];

	int *p = x + n;
	int *q = x + m;

	p - q
	((char *)p - (char *)q)/4
	((char *)p - (char *)q)/sizeof(int)

The first two are equivalent on all targets we care about.  However, an
attempt to make the second one "more portable" silently creates the
code that'll do something entirely different as soon as we get m > n...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 17:28 ` Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data) Linus Torvalds
  2007-12-17 17:48   ` Al Viro
@ 2007-12-17 17:55   ` Eric Dumazet
  2007-12-17 18:05     ` Ray Lee
                       ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2007-12-17 17:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Herbert Xu, John Reiser, Andrew Morton, security, tytso,
	Linux Kernel Mailing List, mpm, linux-sparse

On Mon, 17 Dec 2007 09:28:57 -0800 (PST)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Sat, 15 Dec 2007, Herbert Xu wrote:
> > 
> > There ought to be a warning about this sort of thing.
> 
> We could add it to sparse. The appended (untested) patch seems to say 
> there's a lot of those signed divides-by-power-of-twos.
> 
> However, the problem with such warnings is that it encourages people to do 
> the simple fix that may be *wrong*. For example, you fixed it with patches 
> like
> 
> > -		int rsvd = r->limit ? 0 : random_read_wakeup_thresh/4;
> > +		int rsvd = r->limit ? 0 : random_read_wakeup_thresh / 4u;
> 
> which is really quite dangerous for several reasons:
> 
>  - it depends intimately on the type of the thing being divided (try it: 
>    it will do nothing at all if the thing you divide is larger than 
>    "unsigned int", since then the "4u" will be turned into a _signed_ 
>    larger type by the C type expansion).
> 

I was looking at lib/extable.c which does emit a signed divide on i386 but not on x86_64:

mid = (last - first) / 2 + first;

So I tried to compiled this on x86_64 :

long *mid(long *a, long *b)
{
	return ((a - b) / 2 + a);
}

It gave :
mid:
        movq    %rdi, %rdx
        subq    %rsi, %rdx
        sarq    $3, %rdx
        movq    %rdx, %rax
        shrq    $63, %rax
        addq    %rdx, %rax
        sarq    %rax
        leaq    (%rdi,%rax,8), %rax
        ret

while 

long *mid(long *a, long *b)
{
	return ((a - b) / 2u + a);
}

gave :
mid:
        movq    %rdi, %rdx
        subq    %rsi, %rdx
        sarq    $3, %rdx
        movq    %rdx, %rax
        shrq    $63, %rax
        addq    %rdx, %rax
        sarq    %rax
        leaq    (%rdi,%rax,8), %rax
        ret

and while :

long *mid(long *a, long *b)
{
	return (((unsigned long)(a - b)) / 2 + a);
}

gave :
mid:
        movq    %rdi, %rax
        subq    %rsi, %rax
        sarq    %rax
        andq    $-8, %rax
        addq    %rdi, %rax
        ret


But I found this cast ugly so I cooked this patch.

[PATCH] Avoid signed arithmetics in search_extable()

On i386 and gcc-4.2.{1|2}, search_extable() currently does integer divides (by 2 !!!), while
we can certainly use a right shift. This looks more a typical bsearch() implementation.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

diff --git a/lib/extable.c b/lib/extable.c
index 463f456..03a81bd 100644
--- a/lib/extable.c
+++ b/lib/extable.c
@@ -54,20 +54,20 @@ search_extable(const struct exception_table_entry *first,
 	       const struct exception_table_entry *last,
 	       unsigned long value)
 {
-	while (first <= last) {
-		const struct exception_table_entry *mid;
+	unsigned long mid, low = 0, high = (last - first);
 
-		mid = (last - first) / 2 + first;
+	while (low <= high) {
+		mid = (low + high) / 2;
 		/*
 		 * careful, the distance between entries can be
-		 * larger than 2GB:
+		 * larger than MAX_LONG:
 		 */
-		if (mid->insn < value)
-			first = mid + 1;
-		else if (mid->insn > value)
-			last = mid - 1;
+		if (first[mid].insn < value)
+			low = mid + 1;
+		else if (first[mid].insn > value)
+			high = mid - 1;
 		else
-			return mid;
+			return first + mid;
         }
         return NULL;
 }

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 17:55   ` Eric Dumazet
@ 2007-12-17 18:05     ` Ray Lee
  2007-12-17 18:10       ` Eric Dumazet
  2007-12-17 18:23     ` Al Viro
  2007-12-17 18:28     ` [Security] Signed divides vs shifts (Re: " Linus Torvalds
  2 siblings, 1 reply; 9+ messages in thread
From: Ray Lee @ 2007-12-17 18:05 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Herbert Xu, John Reiser, Andrew Morton, security,
	tytso, Linux Kernel Mailing List, mpm, linux-sparse

On Dec 17, 2007 9:55 AM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> -               mid = (last - first) / 2 + first;
> +       while (low <= high) {
> +               mid = (low + high) / 2;

I think you just introduced a bug. Think about what happens if
low=high=MAX_LONG/2 + 1.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 18:05     ` Ray Lee
@ 2007-12-17 18:10       ` Eric Dumazet
  2007-12-17 18:12         ` Ray Lee
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2007-12-17 18:10 UTC (permalink / raw)
  To: Ray Lee
  Cc: Linus Torvalds, Herbert Xu, John Reiser, Andrew Morton, security,
	tytso, Linux Kernel Mailing List, mpm, linux-sparse

On Mon, 17 Dec 2007 10:05:35 -0800
"Ray Lee" <ray-lk@madrabbit.org> wrote:

> On Dec 17, 2007 9:55 AM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> > -               mid = (last - first) / 2 + first;
> > +       while (low <= high) {
> > +               mid = (low + high) / 2;
> 
> I think you just introduced a bug. Think about what happens if
> low=high=MAX_LONG/2 + 1.
> 

Fortunatly this is not possible :)

Hint : sizeof(struct exception_table_entry) is >= 8

so high is garanteed to be <= MAX_LONG/8

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 18:10       ` Eric Dumazet
@ 2007-12-17 18:12         ` Ray Lee
  0 siblings, 0 replies; 9+ messages in thread
From: Ray Lee @ 2007-12-17 18:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Herbert Xu, John Reiser, Andrew Morton, security,
	tytso, Linux Kernel Mailing List, mpm, linux-sparse

On Dec 17, 2007 10:10 AM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> On Mon, 17 Dec 2007 10:05:35 -0800
> "Ray Lee" <ray-lk@madrabbit.org> wrote:
>
> > On Dec 17, 2007 9:55 AM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> > > -               mid = (last - first) / 2 + first;
> > > +       while (low <= high) {
> > > +               mid = (low + high) / 2;
> >
> > I think you just introduced a bug. Think about what happens if
> > low=high=MAX_LONG/2 + 1.
> >
>
> Fortunatly this is not possible :)
>
> Hint : sizeof(struct exception_table_entry) is >= 8
>
> so high is garanteed to be <= MAX_LONG/8

Ah, my bad. One of these days I'll learn to not post before coffee.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 17:55   ` Eric Dumazet
  2007-12-17 18:05     ` Ray Lee
@ 2007-12-17 18:23     ` Al Viro
  2007-12-17 18:28     ` [Security] Signed divides vs shifts (Re: " Linus Torvalds
  2 siblings, 0 replies; 9+ messages in thread
From: Al Viro @ 2007-12-17 18:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Herbert Xu, John Reiser, Andrew Morton, security,
	tytso, Linux Kernel Mailing List, mpm, linux-sparse

On Mon, Dec 17, 2007 at 06:55:57PM +0100, Eric Dumazet wrote:

> long *mid(long *a, long *b)
> {
> 	return ((a - b) / 2 + a);
> }

... is not actually a middle (you'd want b-a, not a-b there), but anyway

> It gave :
> mid:
>         movq    %rdi, %rdx
>         subq    %rsi, %rdx
>         sarq    $3, %rdx
>         movq    %rdx, %rax
>         shrq    $63, %rax
>         addq    %rdx, %rax
>         sarq    %rax
>         leaq    (%rdi,%rax,8), %rax
>         ret
> 
> while 
> 
> long *mid(long *a, long *b)
> {
> 	return ((a - b) / 2u + a);
> }

... undefined behaviour if a < b

> and while :
> 
> long *mid(long *a, long *b)
> {
> 	return (((unsigned long)(a - b)) / 2 + a);
> }

undefined behaviour, again.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Security] Signed divides vs shifts (Re: /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 17:55   ` Eric Dumazet
  2007-12-17 18:05     ` Ray Lee
  2007-12-17 18:23     ` Al Viro
@ 2007-12-17 18:28     ` Linus Torvalds
  2007-12-17 19:08       ` Al Viro
  2 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2007-12-17 18:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: security, tytso, Herbert Xu, John Reiser,
	Linux Kernel Mailing List, linux-sparse, mpm, Andrew Morton



On Mon, 17 Dec 2007, Eric Dumazet wrote:
> 
> while 
> 
> long *mid(long *a, long *b)
> {
> 	return ((a - b) / 2u + a);
> }

This is exactly what I'm talking about. That "2u" is TOTALLY POINTLESS. 
It's an "unsigned int", but since (a-b) will be of type ptrdiff_t, and is 
*wider* on a 64-bit architecture (it's the same as "long" on x86-64), then 
the 2u will just be converted to "long", and be signed again! 

So you thought that you did an unsigned divide, but you did no such thing. 

If you change the "2u" to a "2ul", it works again, and you get

	mid:
	        movq    %rdi, %rax
	        subq    %rsi, %rax
	        sarq    %rax
	        andq    $-8, %rax
	        addq    %rdi, %rax
	        ret

which is the code you wanted. But quite frankly, you could just have 
written it with a shift to start with, and avoided the subtle type issue, 
although gcc then generates

        movq    %rdi, %rax
        subq    %rsi, %rax
        sarq    $4, %rax
        leaq    (%rdi,%rax,8), %rax
        ret

instead. Of course, this all *does* still have subtle sign issues, because 
the "a-b" part implies a signed divide in itself, which is why you see 
that "sarq" in he first place (rather than a "shrq"). 

Signed divides are hard. The "a-b" pointer subtraction is actually cheaper 
than a general signed divide by sizeof, since the compiler can then assume 
that the two pointers are mutually aligned, which is why gcc can generate 
just a single "sarq" instead of having to do an extra "add negative bit" 
thing to get the rounding right.

[ So Al, when you said that

	(a-b)

  is equivalent to

	((char *)a-(char *)b)/4

  for a "int *" a and b, you're right in the sense that the *result* is 
  the same, but the code generation likely isn't. The "a-b" thing can (and 
  does) allow the compiler to avoid the whole "align up for signed 
  numbers" thing, and the difference in code generation is clear:

	subq    %rsi, %rdi
	sarq    $2, %rdi

  vs

	subq    %rsi, %rdi
	leaq    3(%rdi), %rax
	testq   %rdi, %rdi
	cmovs   %rax, %rdi
	sarq    $2, %rdi

  exactly because the first case *knows* that the low two bits have to be 
  zero, and thus there is no rounding issue. ]

			Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Security] Signed divides vs shifts (Re: /dev/urandom uses uninit bytes, leaks user data)
  2007-12-17 18:28     ` [Security] Signed divides vs shifts (Re: " Linus Torvalds
@ 2007-12-17 19:08       ` Al Viro
  0 siblings, 0 replies; 9+ messages in thread
From: Al Viro @ 2007-12-17 19:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, security, tytso, Herbert Xu, John Reiser,
	Linux Kernel Mailing List, linux-sparse, mpm, Andrew Morton

On Mon, Dec 17, 2007 at 10:28:38AM -0800, Linus Torvalds wrote:
> [ So Al, when you said that
> 
> 	(a-b)
> 
>   is equivalent to
> 
> 	((char *)a-(char *)b)/4
> 
>   for a "int *" a and b, you're right in the sense that the *result* is 
>   the same, but the code generation likely isn't. The "a-b" thing can (and 

Sure.  And yes, I very much do prefer code that uses C as it ought to be
used and doesn't play games with casts from hell, etc.  For a lot of reasons,
both correctness- and efficiency-related.

We _do_ have such turds.  In spades.  And such places are potential timebombs,
since well-intentioned idiotic patch ("I've read in lecture notes that sizeof
is better than explicit constant, so replacement surely can only improve the
things and the best part is, I don't need to understand what I'm doing")
turns an ugly FPOS into equally ugly FPOS that silently doesn't work ;-/

[sorry about the rant, I'm about 3/4 through the drivers/net colonoscopy,
with >300Kb of patches and a pile of assorted bugs so far - and then there's
drivers/scsi to deal with.  Endianness stuff, mostly...]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2007-12-17 19:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <E1J3RD5-0006yU-00@gondolin.me.apana.org.au>
2007-12-17 17:28 ` Signed divides vs shifts (Re: [Security] /dev/urandom uses uninit bytes, leaks user data) Linus Torvalds
2007-12-17 17:48   ` Al Viro
2007-12-17 17:55   ` Eric Dumazet
2007-12-17 18:05     ` Ray Lee
2007-12-17 18:10       ` Eric Dumazet
2007-12-17 18:12         ` Ray Lee
2007-12-17 18:23     ` Al Viro
2007-12-17 18:28     ` [Security] Signed divides vs shifts (Re: " Linus Torvalds
2007-12-17 19:08       ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).