public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* RE: SSE related security hole
@ 2002-04-22 22:24 Saxena, Sunil
  0 siblings, 0 replies; 61+ messages in thread
From: Saxena, Sunil @ 2002-04-22 22:24 UTC (permalink / raw)
  To: 'rhkernel-dept-list@redhat.com'
  Cc: linux-kernel, jakub, aj, ak, pavel

Hi Everyone,

Sorry it took us some time to respond to this issue.

The Intel prescribed method for correct execution of SSE/SSE2 instructions
requires that software detect the support for these instructions via the
CPUID feature flags bits, prior to executing the instruction (see
<<ftp://download.intel.com/design/perftool/cbts/appnotes/ap900/cpuosid.pdf>>
). Also, Section 2.2 of the IA-32 Intel(R) Architecture Software Developer's
Manual, Vol 2, indicates that the use of the operand size prefix with
MMX/SSE/SSE2 instructions is reserved and may cause unpredictable behavior.


We recognized that there is a discrepancy in the individual instruction
descriptions in Vol 2 where it is indicated that the instruction would
generate a UD#. We will be rectifying this discrepancy in the next revision
of Vol 2 as well as via the monthly Specification Updates.

Thanks
Sunil


-----Original Message-----
From: Doug Ledford [mailto:dledford@redhat.com] 
Sent: Wednesday, April 17, 2002 4:56 PM
To: sunil.saxena@intel.com
Cc: linux-kernel@vger.kernel.org; jakub@redhat.com; aj@suse.de; ak@suse.de;
pavel@atrey.karlin.mff.cuni.cz
Subject: Re: SSE related security hole


Subject: Re: SSE related security hole
Reply-To: 
In-Reply-To: <200204171513.g3HFD3n04056@fenrus.demon.nl>; from
arjan@fenrus.demon.nl on Wed, Apr 17, 2002 at 04:13:03PM +0100

(NOTE: This may have already been answered by someone else, but I haven't 
seen it if it has, so I'm sending this through)

> Hi,
> while debugging GCC bugreport, I noticed following behaviour of simple
> program with no syscalls:
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 56
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 84
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 112
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ echo
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> 
> 
> ie it always returns different value, moreover when something else
> is run in meantime (verified by loading WWW page served by same machine),
> the counter is reinitialized to 28.
> 
> I am attaching the source, but it needs to be compiled by cfg-branch GCC
> with settings -O2 -march=pentium3 -mfpmath=sse, so I've placed static
> binary to http://atrey.karlin.mff.cuni.cz/~hubicka/badsum.bin

Compiling the asm source with a different compiler will also make it fail.

> The problem appears to be reproducible only on pentium3 and athlon4
systems,
> not pentium4 system, where it appears to work as expected.  Reproduced on
> both 2.4.9-RH and 2.4.16 kernels.

[ program snipped ]

So there are two different issues at play here.  First, the kernel uses
the fninit instruction to initialize the fpu on first use.  Nothing in the
Intel manuals says anything about the fninit instruction clearing the mmx
or sse registers, and experimentally we now know for sure that it doesn't.  
That means that when the first time your program was ran it left 28 in
register xmm1.  The next time the program was run, the fninit did nothing
to clear register xmm1 so it still held 28.  Now, the pxor instruction
that is part of the m() function and intended to 0 out the xmm1 register
is an sse2 instruction.  It just so happens that it doesn't work on sse
only processors such as P3 CPUs.  So, when 28 was left in xmm1, then the
pxor failed to 0 out xmm1, we saved 28 as the starting value for the loop
and then looped through 7 additions until we had, you guessed it, 56.  In
fact, if you do a while :;do bad; done loop the it will increment by 28
each time it is run except when something else intervenes.  Replacing the
pxor instruction with xorps instead makes it work.  So, that's a bug in
gcc I suspect, using sse2 instructions when only called to use sse
instructions.  It seems odd to me that the CPU wouldn't generate an
illegal instruction exception, but oh well, it evidently doesn't.

So, we really should change arch/i386/kernel/i387.c something like this:

(WARNING, totally untested and not even compile checked change follows)

--- i387.c.save	Wed Apr 17 19:22:47 2002
+++ i387.c	Wed Apr 17 19:28:27 2002
@@ -33,8 +33,26 @@
 void init_fpu(void)
 {
 	__asm__("fninit");
-	if ( cpu_has_xmm )
+	if ( cpu_has_mmx )
+		asm volatile("xorq %%mm0, %%mm0;
+			      xorq %%mm1, %%mm1;
+			      xorq %%mm2, %%mm2;
+			      xorq %%mm3, %%mm3;
+			      xorq %%mm4, %%mm4;
+			      xorq %%mm5, %%mm5;
+			      xorq %%mm6, %%mm6;
+			      xorq %%mm7, %%mm7");
+	if ( cpu_has_xmm ) {
+		asm volatile("xorps %%xmm0, %%xmm0;
+			      xorps %%xmm1, %%xmm1;
+			      xorps %%xmm2, %%xmm2;
+			      xorps %%xmm3, %%xmm3;
+			      xorps %%xmm4, %%xmm4;
+			      xorps %%xmm5, %%xmm5;
+			      xorps %%xmm6, %%xmm6;
+			      xorps %%xmm7, %%xmm7");
 		load_mxcsr(0x1f80);
+	}
 		
 	current->used_math = 1;
 }


The rest of the problem is a gcc bug and possibly something that Intel 
should make a note of on the p3 processors (that is, that the p3 will 
silently fail to execute some sse2 instructions without generating the 
expected exception).

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  



_______________________________________________
rhkernel-dept-list mailing list
rhkernel-dept-list@redhat.com
http://post-office.corp.redhat.com/mailman/listinfo/rhkernel-dept-list

^ permalink raw reply	[flat|nested] 61+ messages in thread
[parent not found: <200204182320.53095.nahshon@actcom.co.il>]
* Re: SSE related security hole
@ 2002-04-18 18:36 linux
  2002-04-18 18:53 ` Richard B. Johnson
  2002-04-18 21:06 ` H. Peter Anvin
  0 siblings, 2 replies; 61+ messages in thread
From: linux @ 2002-04-18 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux

Um, people here seem to be assuming that, in the absence of MMX,
fninit *doesn't* leak information.

I thought it was well-known to just clear (set to all-ones) the
tag register and not alter the actual floating-point registers.

Thus, it seems quite feasible to reset the tag word with FLDENV and
store out the FPU registers, even on an 80387.

Isn't this the same security hole?  Shouldn't there be 8 FLDZ instructions
(or equivalent) in the processor state initialization?

^ permalink raw reply	[flat|nested] 61+ messages in thread
* Re: SSE related security hole
@ 2002-04-17 23:42 Doug Ledford
  2002-04-18  5:26 ` Andrea Arcangeli
  2002-04-18  8:22 ` Andi Kleen
  0 siblings, 2 replies; 61+ messages in thread
From: Doug Ledford @ 2002-04-17 23:42 UTC (permalink / raw)
  To: jh; +Cc: linux-kernel, jakub, aj, ak, pavel

Subject: Re: SSE related security hole
Reply-To: 
In-Reply-To: <200204171513.g3HFD3n04056@fenrus.demon.nl>; from arjan@fenrus.demon.nl on Wed, Apr 17, 2002 at 04:13:03PM +0100

(NOTE: This may have already been answered by someone else, but I haven't 
seen it if it has, so I'm sending this through)

> Hi,
> while debugging GCC bugreport, I noticed following behaviour of simple
> program with no syscalls:
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 56
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 84
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 112
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ echo
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> 
> 
> ie it always returns different value, moreover when something else
> is run in meantime (verified by loading WWW page served by same machine),
> the counter is reinitialized to 28.
> 
> I am attaching the source, but it needs to be compiled by cfg-branch GCC
> with settings -O2 -march=pentium3 -mfpmath=sse, so I've placed static
> binary to http://atrey.karlin.mff.cuni.cz/~hubicka/badsum.bin

Compiling the asm source with a different compiler will also make it fail.

> The problem appears to be reproducible only on pentium3 and athlon4 systems,
> not pentium4 system, where it appears to work as expected.  Reproduced on
> both 2.4.9-RH and 2.4.16 kernels.

[ program snipped ]

So there are two different issues at play here.  First, the kernel uses
the fninit instruction to initialize the fpu on first use.  Nothing in the
Intel manuals says anything about the fninit instruction clearing the mmx
or sse registers, and experimentally we now know for sure that it doesn't.  
That means that when the first time your program was ran it left 28 in
register xmm1.  The next time the program was run, the fninit did nothing
to clear register xmm1 so it still held 28.  Now, the pxor instruction
that is part of the m() function and intended to 0 out the xmm1 register
is an sse2 instruction.  It just so happens that it doesn't work on sse
only processors such as P3 CPUs.  So, when 28 was left in xmm1, then the
pxor failed to 0 out xmm1, we saved 28 as the starting value for the loop
and then looped through 7 additions until we had, you guessed it, 56.  In
fact, if you do a while :;do bad; done loop the it will increment by 28
each time it is run except when something else intervenes.  Replacing the
pxor instruction with xorps instead makes it work.  So, that's a bug in
gcc I suspect, using sse2 instructions when only called to use sse
instructions.  It seems odd to me that the CPU wouldn't generate an
illegal instruction exception, but oh well, it evidently doesn't.

So, we really should change arch/i386/kernel/i387.c something like this:

(WARNING, totally untested and not even compile checked change follows)

--- i387.c.save	Wed Apr 17 19:22:47 2002
+++ i387.c	Wed Apr 17 19:28:27 2002
@@ -33,8 +33,26 @@
 void init_fpu(void)
 {
 	__asm__("fninit");
-	if ( cpu_has_xmm )
+	if ( cpu_has_mmx )
+		asm volatile("xorq %%mm0, %%mm0;
+			      xorq %%mm1, %%mm1;
+			      xorq %%mm2, %%mm2;
+			      xorq %%mm3, %%mm3;
+			      xorq %%mm4, %%mm4;
+			      xorq %%mm5, %%mm5;
+			      xorq %%mm6, %%mm6;
+			      xorq %%mm7, %%mm7");
+	if ( cpu_has_xmm ) {
+		asm volatile("xorps %%xmm0, %%xmm0;
+			      xorps %%xmm1, %%xmm1;
+			      xorps %%xmm2, %%xmm2;
+			      xorps %%xmm3, %%xmm3;
+			      xorps %%xmm4, %%xmm4;
+			      xorps %%xmm5, %%xmm5;
+			      xorps %%xmm6, %%xmm6;
+			      xorps %%xmm7, %%xmm7");
 		load_mxcsr(0x1f80);
+	}
 		
 	current->used_math = 1;
 }


The rest of the problem is a gcc bug and possibly something that Intel 
should make a note of on the p3 processors (that is, that the p3 will 
silently fail to execute some sse2 instructions without generating the 
expected exception).

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  

^ permalink raw reply	[flat|nested] 61+ messages in thread
* SSE related security hole
@ 2002-04-17 14:51 Jan Hubicka
  2002-04-17 15:23 ` Jan Hubicka
  0 siblings, 1 reply; 61+ messages in thread
From: Jan Hubicka @ 2002-04-17 14:51 UTC (permalink / raw)
  To: bugtraq, linux-kernel, Pavel Machek, jakub, aj, ak

[-- Attachment #1: Type: text/plain, Size: 1072 bytes --]

Hi,
while debugging GCC bugreport, I noticed following behaviour of simple
program with no syscalls:

hubicka@nikam:~$ ./a.out
sum of 7 ints: 28
hubicka@nikam:~$ ./a.out
sum of 7 ints: 56
Bad sum (seen with gcc -O -march=pentiumpro -msse)
hubicka@nikam:~$ ./a.out
sum of 7 ints: 84
Bad sum (seen with gcc -O -march=pentiumpro -msse)
hubicka@nikam:~$ ./a.out
sum of 7 ints: 112
Bad sum (seen with gcc -O -march=pentiumpro -msse)
hubicka@nikam:~$ echo

hubicka@nikam:~$ ./a.out
sum of 7 ints: 28


ie it always returns different value, moreover when something else
is run in meantime (verified by loading WWW page served by same machine),
the counter is reinitialized to 28.

I am attaching the source, but it needs to be compiled by cfg-branch GCC
with settings -O2 -march=pentium3 -mfpmath=sse, so I've placed static
binary to http://atrey.karlin.mff.cuni.cz/~hubicka/badsum.bin

The problem appears to be reproducible only on pentium3 and athlon4 systems,
not pentium4 system, where it appears to work as expected.  Reproduced on
both 2.4.9-RH and 2.4.16 kernels.

Honza

[-- Attachment #2: badsum.c --]
[-- Type: text/x-csrc, Size: 495 bytes --]

#include <stdlib.h>
#include <stdio.h>

#define t(x) asm("rdtsc":"=A"(x));
	char str[256],b;
int m(){
	long long a,b,c,d;
int i,n=7;	//choose any n < sqrt(1/FLT_EPSILON)
    float comp,sum=0;
	sin(1);
    for(i=1;i<=n;++i)
	sum += i;
    sprintf(str,"sum of %d ints: %g\n",n,sum);
    comp = n*(n*.5f+.5f);
    if(sum != comp){
	    b=1;
	}
    printf(str);
    if (b)
	printf("Bad sum (seen with gcc -O -march=pentiumpro -msse)\n");
    return 0;
}
main()
{
	/*sin(1);*/
	asm("finit");
	m();
}

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2002-04-26 11:55 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20020418183639.20946.qmail@science.horizon.com.suse.lists.linux.kernel>
     [not found] ` <a9ncgs$2s2$1@cesium.transmeta.com.suse.lists.linux.kernel>
2002-04-19 14:06   ` SSE related security hole Andi Kleen
2002-04-19 18:00     ` Doug Ledford
2002-04-19 21:04       ` Andrea Arcangeli
2002-04-19 21:35         ` H. Peter Anvin
2002-04-19 21:42           ` Andi Kleen
2002-04-20  3:23             ` Andrea Arcangeli
2002-04-19 23:12           ` [PATCH] " Brian Gerst
2002-04-19 23:41             ` Linus Torvalds
2002-04-20  0:01               ` H. Peter Anvin
2002-04-20  0:09                 ` Linus Torvalds
2002-04-20  0:11                   ` Brian Gerst
2002-04-20  0:19                   ` H. Peter Anvin
2002-04-20  0:29                     ` Linus Torvalds
2002-04-20  0:31                   ` Alan Cox
2002-04-20  0:08               ` Brian Gerst
2002-04-20  0:21                 ` Linus Torvalds
2002-04-20  4:21                 ` Andrea Arcangeli
2002-04-20  4:35                   ` Linus Torvalds
2002-04-20  5:07                     ` Andrea Arcangeli
2002-04-20 16:27                       ` Linus Torvalds
2002-04-20 17:27                         ` Andrea Arcangeli
2002-04-20 17:38                           ` Linus Torvalds
2002-04-20 18:12                             ` Andrea Arcangeli
2002-04-20 19:30                               ` Linus Torvalds
2002-04-20 19:41                                 ` Andi Kleen
2002-04-20 21:28                                   ` Andrea Arcangeli
2002-04-20 22:43                                     ` H. Peter Anvin
2002-04-21  2:09                                       ` Andrea Arcangeli
2002-04-20 23:23                                     ` Linus Torvalds
2002-04-21  2:08                                       ` Andrea Arcangeli
2002-04-20 23:13                                   ` Linus Torvalds
2002-04-23 19:21                               ` Linus Torvalds
2002-04-23 20:05                                 ` H. Peter Anvin
2002-04-24  0:32                                 ` Andrea Arcangeli
2002-04-24  2:10                                   ` Linus Torvalds
2002-04-26  9:13                                     ` Pavel Machek
2002-04-26 11:55                                       ` Andrea Arcangeli
2002-04-19 22:18         ` Jan Hubicka
2002-04-22 22:24 Saxena, Sunil
     [not found] <200204182320.53095.nahshon@actcom.co.il>
2002-04-19 11:22 ` Alan Cox
  -- strict thread matches above, loose matches on Subject: below --
2002-04-18 18:36 linux
2002-04-18 18:53 ` Richard B. Johnson
2002-04-21 19:52   ` Pavel Machek
2002-04-21 22:11   ` David Wagner
2002-04-18 21:06 ` H. Peter Anvin
2002-04-17 23:42 Doug Ledford
2002-04-18  5:26 ` Andrea Arcangeli
2002-04-18  9:10   ` Arjan van de Ven
2002-04-18 11:18   ` Alan Cox
2002-04-18 11:14     ` Andi Kleen
2002-04-18 11:53       ` Alan Cox
2002-04-18 11:46         ` Andi Kleen
2002-04-18 11:55         ` Andi Kleen
2002-04-18 13:44   ` Doug Ledford
2002-04-18 19:20     ` Pavel Machek
2002-04-18 19:32       ` Doug Ledford
2002-04-21 19:54         ` Pavel Machek
2002-04-18  8:22 ` Andi Kleen
2002-04-17 14:51 Jan Hubicka
2002-04-17 15:23 ` Jan Hubicka
2002-04-18 14:57   ` Denis Vlasenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox