All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] strcmp: fix overflow error
@ 2009-11-17 16:51 Uwe Kleine-König
  2009-11-17 17:36 ` Andreas Schwab
  2009-11-17 17:41 ` Linus Torvalds
  0 siblings, 2 replies; 11+ messages in thread
From: Uwe Kleine-König @ 2009-11-17 16:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Michael Buesch, Peter Zijlstra, Andrew Morton, Linus Torvalds

strcmp("\x01", "\xef") returns 18 but it should return something < 0.
The reason is that the variable holding the result of the subtraction is
too small and overflows.

As strcmp is e.g. used to access data in squashfs this might result in
not finding files.

The same problem is fixed in strncmp.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
Hello,

I didn't hit this problem in the wild, only when checking for something
else.  Is this stable material anyhow?

Best regards
Uwe

 lib/string.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index b19b87a..661ff06 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -246,7 +246,7 @@ EXPORT_SYMBOL(strlcat);
 #undef strcmp
 int strcmp(const char *cs, const char *ct)
 {
-	signed char __res;
+	int __res;
 
 	while (1) {
 		if ((__res = *cs - *ct++) != 0 || !*cs++)
@@ -266,7 +266,7 @@ EXPORT_SYMBOL(strcmp);
  */
 int strncmp(const char *cs, const char *ct, size_t count)
 {
-	signed char __res = 0;
+	int __res = 0;
 
 	while (count) {
 		if ((__res = *cs - *ct++) != 0 || !*cs++)
-- 
1.6.5.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 16:51 [PATCH] strcmp: fix overflow error Uwe Kleine-König
@ 2009-11-17 17:36 ` Andreas Schwab
  2009-11-17 17:41 ` Linus Torvalds
  1 sibling, 0 replies; 11+ messages in thread
From: Andreas Schwab @ 2009-11-17 17:36 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-kernel, Michael Buesch, Peter Zijlstra, Andrew Morton,
	Linus Torvalds

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:

> strcmp("\x01", "\xef") returns 18 but it should return something < 0.
> The reason is that the variable holding the result of the subtraction is
> too small and overflows.

When char is signed strcmp("\xef", "\x01") still returns something < 0.
You need to cast to unsigned char first.

Andreas.

-- 
Andreas Schwab, schwab@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 16:51 [PATCH] strcmp: fix overflow error Uwe Kleine-König
  2009-11-17 17:36 ` Andreas Schwab
@ 2009-11-17 17:41 ` Linus Torvalds
  2009-11-17 18:16   ` Michael Buesch
                     ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Linus Torvalds @ 2009-11-17 17:41 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-kernel, Michael Buesch, Peter Zijlstra, Andrew Morton



On Tue, 17 Nov 2009, Uwe Kleine-König wrote:
>
> strcmp("\x01", "\xef") returns 18 but it should return something < 0.
> The reason is that the variable holding the result of the subtraction is
> too small and overflows.

No. The reason is that whoever wrote that function is a moron and doesn't 
know the standard. And your fix is not correct _either_

The comparison should be done as *unsigned char*. As specified by POSIX

  "The sign of a non-zero return value shall be determined by the sign of 
   the difference between the values of the first pair of bytes (both 
   interpreted as type unsigned char) that differ in the strings being 
   compared."

and both the original code and your change gets it wrong in different 
ways.

>  int strcmp(const char *cs, const char *ct)
>  {
> -	signed char __res;
> +	int __res;
>  
>  	while (1) {
>  		if ((__res = *cs - *ct++) != 0 || !*cs++)

So this is fundamentally incorrect both with "signed char __res" _and_ 
with "int __res", because '*cs' and '*ct' are both (possibly - it depends 
on the compiler and architecture) signed chars.

So in the case you mention, strcmp() _should_ return a negative value, 
because "\x01" is smaller than "\xef", but you have:

 - *cs = 1, *ct = (char) 0xef = -17 _OR_ 239 depending on sign of 'char'

and as a result:

 - signed char __res = 18 (incorrect, regardless: ct is larger)

 - int __res = 18 (incorrect) or -238 (correct) depending on sign of char

so your patch doesn't actually help at all.

What would help is something like the appended, but I have not tested it 
AT ALL. It may be total and utter crap too. Maybe it doesn't compile, 
maybe it buggers your pet hedgehog. I just don't know.

		Linus

---
 lib/string.c |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index b19b87a..e96421a 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -246,13 +246,17 @@ EXPORT_SYMBOL(strlcat);
 #undef strcmp
 int strcmp(const char *cs, const char *ct)
 {
-	signed char __res;
+	unsigned char c1, c2;
 
 	while (1) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		c1 = *cs++;
+		c2 = *ct++;
+		if (c1 != c2)
+			return c1 < c2 ? -1 : 1;
+		if (!c1)
 			break;
 	}
-	return __res;
+	return 0;
 }
 EXPORT_SYMBOL(strcmp);
 #endif
@@ -266,14 +270,18 @@ EXPORT_SYMBOL(strcmp);
  */
 int strncmp(const char *cs, const char *ct, size_t count)
 {
-	signed char __res = 0;
+	unsigned char c1, c2;
 
 	while (count) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		c1 = *cs++;
+		c2 = *ct++;
+		if (c1 != c2)
+			return c1 < c2 ? -1 : 1;
+		if (!c1)
 			break;
 		count--;
 	}
-	return __res;
+	return 0;
 }
 EXPORT_SYMBOL(strncmp);
 #endif

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 17:41 ` Linus Torvalds
@ 2009-11-17 18:16   ` Michael Buesch
  2009-11-17 18:40     ` Linus Torvalds
  2009-11-17 20:34     ` Andreas Schwab
  2009-11-17 18:55   ` Uwe Kleine-König
  2009-11-18 21:31   ` [PATCH] strcmp: fix overflow and possibly signedness error Uwe Kleine-König
  2 siblings, 2 replies; 11+ messages in thread
From: Michael Buesch @ 2009-11-17 18:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Uwe Kleine-König, linux-kernel, Peter Zijlstra,
	Andrew Morton

On Tuesday 17 November 2009 18:41:58 Linus Torvalds wrote:
> 
> On Tue, 17 Nov 2009, Uwe Kleine-König wrote:
> >
> > strcmp("\x01", "\xef") returns 18 but it should return something < 0.
> > The reason is that the variable holding the result of the subtraction is
> > too small and overflows.
> 
> No. The reason is that whoever wrote that function is a moron and doesn't 
> know the standard. And your fix is not correct _either_
> 
> The comparison should be done as *unsigned char*. As specified by POSIX
> 
>   "The sign of a non-zero return value shall be determined by the sign of 
>    the difference between the values of the first pair of bytes (both 
>    interpreted as type unsigned char) that differ in the strings being 
>    compared."
> 
> and both the original code and your change gets it wrong in different 
> ways.
> 
> >  int strcmp(const char *cs, const char *ct)
> >  {
> > -	signed char __res;
> > +	int __res;
> >  
> >  	while (1) {
> >  		if ((__res = *cs - *ct++) != 0 || !*cs++)
> 
> So this is fundamentally incorrect both with "signed char __res" _and_ 
> with "int __res", because '*cs' and '*ct' are both (possibly - it depends 
> on the compiler and architecture) signed chars.
> 
> So in the case you mention, strcmp() _should_ return a negative value, 
> because "\x01" is smaller than "\xef", but you have:
> 
>  - *cs = 1, *ct = (char) 0xef = -17 _OR_ 239 depending on sign of 'char'
> 
> and as a result:
> 
>  - signed char __res = 18 (incorrect, regardless: ct is larger)
> 
>  - int __res = 18 (incorrect) or -238 (correct) depending on sign of char
> 
> so your patch doesn't actually help at all.
> 
> What would help is something like the appended, but I have not tested it 
> AT ALL. It may be total and utter crap too. Maybe it doesn't compile, 
> maybe it buggers your pet hedgehog. I just don't know.
> 
> 		Linus
> 
> ---
>  lib/string.c |   20 ++++++++++++++------
>  1 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/string.c b/lib/string.c
> index b19b87a..e96421a 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -246,13 +246,17 @@ EXPORT_SYMBOL(strlcat);
>  #undef strcmp
>  int strcmp(const char *cs, const char *ct)
>  {
> -	signed char __res;
> +	unsigned char c1, c2;
>  
>  	while (1) {
> -		if ((__res = *cs - *ct++) != 0 || !*cs++)
> +		c1 = *cs++;
> +		c2 = *ct++;
> +		if (c1 != c2)
> +			return c1 < c2 ? -1 : 1;
> +		if (!c1)
>  			break;
>  	}
> -	return __res;
> +	return 0;
>  }

Well, that doesn't actually return the difference at all. Is that allowed?

What about this? (Manually hacked pseudo patch)

diff --git a/lib/string.c b/lib/string.c
index b19b87a..661ff06 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -246,7 +246,7 @@ EXPORT_SYMBOL(strlcat);
 #undef strcmp
 int strcmp(const char *cs, const char *ct)
 {
-       signed char __res;
+       int __res;
 
        while (1) {
-                if ((__res = *cs - *ct++) != 0 || !*cs++)
+                if ((__res = (unsigned char)(*cs) - (unsigned char)(*ct++)) != 0 || !*cs++)
@@ -266,7 +266,7 @@ EXPORT_SYMBOL(strcmp);
  */

-- 
Greetings, Michael.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 18:16   ` Michael Buesch
@ 2009-11-17 18:40     ` Linus Torvalds
  2009-11-17 20:34     ` Andreas Schwab
  1 sibling, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2009-11-17 18:40 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Uwe Kleine-König, linux-kernel, Peter Zijlstra,
	Andrew Morton



On Tue, 17 Nov 2009, Michael Buesch wrote:
> 
> Well, that doesn't actually return the difference at all. Is that allowed?

It's in fact the common implementation. Returning -1/0/1 is kind of 
polite, since it means that the sign now fits in a minimal type (ie you 
can save the end result in a "signed char" and it will _work_, even 
though it's not guaranteed by the standard.

Returning any negative or positive number is certainly _allowed_ by the 
standard, but I just checked, and glibc does the "polite" -1/0/1 thing. I 
suspect many other libraries do too, and it's not like it costs you 
anything more.

In fact, the written-out-with-unsigned-char-variables version is also 
likely to generate better code than the "clever" one that does just one 
subtract, because now the code can do all the comparisons in just 
'unsigned char' and never needs to sign-extend the result to 'int'.

Not that it likely matters.

I double-checked, and the code generated from my patch looks sane.

	strcmp:
	        pushq   %rbp    #
	        movq    %rsp, %rbp      #,
	.L52:
	        movb    (%rdi), %al     #* cs, c1
	        movb    (%rsi), %dl     #* ct, c2
	        incq    %rdi    # cs
	        incq    %rsi    # ct
	        cmpb    %dl, %al        # c2, c1
	        je      .L49    #,
	        sbbl    %eax, %eax      # D.13150
	        orl     $1, %eax        #, D.13150
	        jmp     .L51    #
	.L49:
	        testb   %al, %al        # c1   
	        jne     .L52    #,
	        xorl    %eax, %eax      # D.13150
	.L51:
	        leave
	        ret

which is not horrible (of course, depending on whether you expect to find 
differences early or late you might want to have make the "L49" case be 
the fallthrough etc).

[ Using sbb+or is a standard x86 trick to get -1/1. You'll also find 
  "sbb+and" to get 0/value, or "sbb+add" to get value/value+1 ]

		Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 17:41 ` Linus Torvalds
  2009-11-17 18:16   ` Michael Buesch
@ 2009-11-17 18:55   ` Uwe Kleine-König
  2009-11-17 19:02     ` Linus Torvalds
  2009-11-18 21:31   ` [PATCH] strcmp: fix overflow and possibly signedness error Uwe Kleine-König
  2 siblings, 1 reply; 11+ messages in thread
From: Uwe Kleine-König @ 2009-11-17 18:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-kernel, Michael Buesch, Peter Zijlstra, Andrew Morton

Hello Linus,

On Tue, Nov 17, 2009 at 09:41:58AM -0800, Linus Torvalds wrote:
> On Tue, 17 Nov 2009, Uwe Kleine-König wrote:
> >
> > strcmp("\x01", "\xef") returns 18 but it should return something < 0.
> > The reason is that the variable holding the result of the subtraction is
> > too small and overflows.
> 
> No. The reason is that whoever wrote that function is a moron and doesn't 
> know the standard. And your fix is not correct _either_
OK, right.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

(BTW, this was already broken in 2.4.0, so I was unable to find out who
is the moron :-)

Best regards
Uwe

-- 
Pengutronix e.K.                              | Uwe Kleine-König            |
Industrial Linux Solutions                    | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 18:55   ` Uwe Kleine-König
@ 2009-11-17 19:02     ` Linus Torvalds
  2009-11-17 19:12       ` Linus Torvalds
  2009-11-17 19:19       ` Joe Perches
  0 siblings, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2009-11-17 19:02 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-kernel, Michael Buesch, Peter Zijlstra, Andrew Morton



On Tue, 17 Nov 2009, Uwe Kleine-König wrote:
>
> (BTW, this was already broken in 2.4.0, so I was unable to find out who
> is the moron :-)

Yeah, I looked at the history too. It's so old that it might well be me.

			Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 19:02     ` Linus Torvalds
@ 2009-11-17 19:12       ` Linus Torvalds
  2009-11-17 19:19       ` Joe Perches
  1 sibling, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2009-11-17 19:12 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-kernel, Michael Buesch, Peter Zijlstra, Andrew Morton



On Tue, 17 Nov 2009, Linus Torvalds wrote:
> 
> Yeah, I looked at the history too. It's so old that it might well be me.

In fact, it goes back to at least 1.2.13.

And the copyright dates do imply that they could go back way further. 

At least the comment says it all:

	"These are buggy as well.."

so I can at least claim that back in the _original_ 0.01 release, it was 
correct:

	extern inline int strcmp(const char * cs,const char * ct)
	{
	register int __res __asm__("ax");
	__asm__("cld\n"
	        "1:\tlodsb\n\t"
	        "scasb\n\t"
	        "jne 2f\n\t"
	        "testb %%al,%%al\n\t"
	        "jne 1b\n\t"
	        "xorl %%eax,%%eax\n\t"
	        "jmp 3f\n"
	        "2:\tmovl $1,%%eax\n\t"
	        "jl 3f\n\t"
	        "negl %%eax\n"
	        "3:"
	        :"=a" (__res):"D" (cs),"S" (ct):"si","di");
	return __res;
	}

and as far as I can tell, the above is actually correct, even if you have 
to be a bit crazy to write 'strcmp' as gcc inline asm (hey, I wrote _all_ 
the string routines that way, and I made gcc do some of them built-in. 
Because I was a MAN, dammit!).

So the bug was apparently introduced when we went portable.

			Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 19:02     ` Linus Torvalds
  2009-11-17 19:12       ` Linus Torvalds
@ 2009-11-17 19:19       ` Joe Perches
  1 sibling, 0 replies; 11+ messages in thread
From: Joe Perches @ 2009-11-17 19:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Uwe Kleine-König, linux-kernel, Michael Buesch,
	Peter Zijlstra, Andrew Morton


It's been like that for awhile.

http://www.linuxhq.com/kernel/v1.1/75/lib/string.c


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] strcmp: fix overflow error
  2009-11-17 18:16   ` Michael Buesch
  2009-11-17 18:40     ` Linus Torvalds
@ 2009-11-17 20:34     ` Andreas Schwab
  1 sibling, 0 replies; 11+ messages in thread
From: Andreas Schwab @ 2009-11-17 20:34 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Linus Torvalds, Uwe Kleine-König, linux-kernel,
	Peter Zijlstra, Andrew Morton

Michael Buesch <mb@bu3sch.de> writes:

> Well, that doesn't actually return the difference at all. Is that allowed?

The only requirement is that the compare functions return <0,0,>0
depending on the order.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] strcmp: fix overflow and possibly signedness error
  2009-11-17 17:41 ` Linus Torvalds
  2009-11-17 18:16   ` Michael Buesch
  2009-11-17 18:55   ` Uwe Kleine-König
@ 2009-11-18 21:31   ` Uwe Kleine-König
  2 siblings, 0 replies; 11+ messages in thread
From: Uwe Kleine-König @ 2009-11-18 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Michael Buesch, Andreas Schwab, Andrew Morton

From: Linus Torvalds <torvalds@linux-foundation.org>

	signed char __res = *cs - *ct;

is wrong for two reasons.  The subtraction can overflow because __res
doesn't use a type big enough.  Moreover the compared bytes should be
interpreted as unsigned char as specified by POSIX.

The same problem is fixed in strncmp.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
Hello Linus,

I resend with a (hopefully) nice commit log as this didn't appear in
your tree up to now.

Best regards
Uwe

 lib/string.c |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index b19b87a..e96421a 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -246,13 +246,17 @@ EXPORT_SYMBOL(strlcat);
 #undef strcmp
 int strcmp(const char *cs, const char *ct)
 {
-	signed char __res;
+	unsigned char c1, c2;
 
 	while (1) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		c1 = *cs++;
+		c2 = *ct++;
+		if (c1 != c2)
+			return c1 < c2 ? -1 : 1;
+		if (!c1)
 			break;
 	}
-	return __res;
+	return 0;
 }
 EXPORT_SYMBOL(strcmp);
 #endif
@@ -266,14 +270,18 @@ EXPORT_SYMBOL(strcmp);
  */
 int strncmp(const char *cs, const char *ct, size_t count)
 {
-	signed char __res = 0;
+	unsigned char c1, c2;
 
 	while (count) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		c1 = *cs++;
+		c2 = *ct++;
+		if (c1 != c2)
+			return c1 < c2 ? -1 : 1;
+		if (!c1)
 			break;
 		count--;
 	}
-	return __res;
+	return 0;
 }
 EXPORT_SYMBOL(strncmp);
 #endif
-- 
1.6.5.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-11-18 21:31 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-17 16:51 [PATCH] strcmp: fix overflow error Uwe Kleine-König
2009-11-17 17:36 ` Andreas Schwab
2009-11-17 17:41 ` Linus Torvalds
2009-11-17 18:16   ` Michael Buesch
2009-11-17 18:40     ` Linus Torvalds
2009-11-17 20:34     ` Andreas Schwab
2009-11-17 18:55   ` Uwe Kleine-König
2009-11-17 19:02     ` Linus Torvalds
2009-11-17 19:12       ` Linus Torvalds
2009-11-17 19:19       ` Joe Perches
2009-11-18 21:31   ` [PATCH] strcmp: fix overflow and possibly signedness error Uwe Kleine-König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.