public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* RE: SSE related security hole
@ 2002-04-22 22:24 Saxena, Sunil
  2002-04-22 22:51 ` Initial process CPU state was " Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Saxena, Sunil @ 2002-04-22 22:24 UTC (permalink / raw)
  To: 'rhkernel-dept-list@redhat.com'
  Cc: linux-kernel, jakub, aj, ak, pavel

Hi Everyone,

Sorry it took us some time to respond to this issue.

The Intel prescribed method for correct execution of SSE/SSE2 instructions
requires that software detect the support for these instructions via the
CPUID feature flags bits, prior to executing the instruction (see
<<ftp://download.intel.com/design/perftool/cbts/appnotes/ap900/cpuosid.pdf>>
). Also, Section 2.2 of the IA-32 Intel(R) Architecture Software Developer's
Manual, Vol 2, indicates that the use of the operand size prefix with
MMX/SSE/SSE2 instructions is reserved and may cause unpredictable behavior.


We recognized that there is a discrepancy in the individual instruction
descriptions in Vol 2 where it is indicated that the instruction would
generate a UD#. We will be rectifying this discrepancy in the next revision
of Vol 2 as well as via the monthly Specification Updates.

Thanks
Sunil


-----Original Message-----
From: Doug Ledford [mailto:dledford@redhat.com] 
Sent: Wednesday, April 17, 2002 4:56 PM
To: sunil.saxena@intel.com
Cc: linux-kernel@vger.kernel.org; jakub@redhat.com; aj@suse.de; ak@suse.de;
pavel@atrey.karlin.mff.cuni.cz
Subject: Re: SSE related security hole


Subject: Re: SSE related security hole
Reply-To: 
In-Reply-To: <200204171513.g3HFD3n04056@fenrus.demon.nl>; from
arjan@fenrus.demon.nl on Wed, Apr 17, 2002 at 04:13:03PM +0100

(NOTE: This may have already been answered by someone else, but I haven't 
seen it if it has, so I'm sending this through)

> Hi,
> while debugging GCC bugreport, I noticed following behaviour of simple
> program with no syscalls:
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 56
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 84
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 112
> Bad sum (seen with gcc -O -march=pentiumpro -msse)
> hubicka@nikam:~$ echo
> 
> hubicka@nikam:~$ ./a.out
> sum of 7 ints: 28
> 
> 
> ie it always returns different value, moreover when something else
> is run in meantime (verified by loading WWW page served by same machine),
> the counter is reinitialized to 28.
> 
> I am attaching the source, but it needs to be compiled by cfg-branch GCC
> with settings -O2 -march=pentium3 -mfpmath=sse, so I've placed static
> binary to http://atrey.karlin.mff.cuni.cz/~hubicka/badsum.bin

Compiling the asm source with a different compiler will also make it fail.

> The problem appears to be reproducible only on pentium3 and athlon4
systems,
> not pentium4 system, where it appears to work as expected.  Reproduced on
> both 2.4.9-RH and 2.4.16 kernels.

[ program snipped ]

So there are two different issues at play here.  First, the kernel uses
the fninit instruction to initialize the fpu on first use.  Nothing in the
Intel manuals says anything about the fninit instruction clearing the mmx
or sse registers, and experimentally we now know for sure that it doesn't.  
That means that when the first time your program was ran it left 28 in
register xmm1.  The next time the program was run, the fninit did nothing
to clear register xmm1 so it still held 28.  Now, the pxor instruction
that is part of the m() function and intended to 0 out the xmm1 register
is an sse2 instruction.  It just so happens that it doesn't work on sse
only processors such as P3 CPUs.  So, when 28 was left in xmm1, then the
pxor failed to 0 out xmm1, we saved 28 as the starting value for the loop
and then looped through 7 additions until we had, you guessed it, 56.  In
fact, if you do a while :;do bad; done loop the it will increment by 28
each time it is run except when something else intervenes.  Replacing the
pxor instruction with xorps instead makes it work.  So, that's a bug in
gcc I suspect, using sse2 instructions when only called to use sse
instructions.  It seems odd to me that the CPU wouldn't generate an
illegal instruction exception, but oh well, it evidently doesn't.

So, we really should change arch/i386/kernel/i387.c something like this:

(WARNING, totally untested and not even compile checked change follows)

--- i387.c.save	Wed Apr 17 19:22:47 2002
+++ i387.c	Wed Apr 17 19:28:27 2002
@@ -33,8 +33,26 @@
 void init_fpu(void)
 {
 	__asm__("fninit");
-	if ( cpu_has_xmm )
+	if ( cpu_has_mmx )
+		asm volatile("xorq %%mm0, %%mm0;
+			      xorq %%mm1, %%mm1;
+			      xorq %%mm2, %%mm2;
+			      xorq %%mm3, %%mm3;
+			      xorq %%mm4, %%mm4;
+			      xorq %%mm5, %%mm5;
+			      xorq %%mm6, %%mm6;
+			      xorq %%mm7, %%mm7");
+	if ( cpu_has_xmm ) {
+		asm volatile("xorps %%xmm0, %%xmm0;
+			      xorps %%xmm1, %%xmm1;
+			      xorps %%xmm2, %%xmm2;
+			      xorps %%xmm3, %%xmm3;
+			      xorps %%xmm4, %%xmm4;
+			      xorps %%xmm5, %%xmm5;
+			      xorps %%xmm6, %%xmm6;
+			      xorps %%xmm7, %%xmm7");
 		load_mxcsr(0x1f80);
+	}
 		
 	current->used_math = 1;
 }


The rest of the problem is a gcc bug and possibly something that Intel 
should make a note of on the p3 processors (that is, that the p3 will 
silently fail to execute some sse2 instructions without generating the 
expected exception).

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  



_______________________________________________
rhkernel-dept-list mailing list
rhkernel-dept-list@redhat.com
http://post-office.corp.redhat.com/mailman/listinfo/rhkernel-dept-list

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Initial process CPU state was Re: SSE related security hole
  2002-04-22 22:24 SSE related security hole Saxena, Sunil
@ 2002-04-22 22:51 ` Andi Kleen
  2002-04-22 23:28   ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2002-04-22 22:51 UTC (permalink / raw)
  To: Saxena, Sunil; +Cc: linux-kernel

"Saxena, Sunil" <sunil.saxena@intel.com> writes:

Hallo Sunil,

> We recognized that there is a discrepancy in the individual instruction
> descriptions in Vol 2 where it is indicated that the instruction would
> generate a UD#. We will be rectifying this discrepancy in the next revision
> of Vol 2 as well as via the monthly Specification Updates.

Could you quickly describe what the Intel recommended way is to clear
the whole CPU at the beginning of a process? Is there a better way
than "save state with fxsave at bootup and restore into each
new process"? After all it would be a bit unfortunate to have
instructions which are transparently tolerant to new CPU state (fxsave/fxrstor 
for context switching), but no matching way to clear the same state for 
security reasons.  Using the bootup FXSAVE image would make linux
depend on the BIOS for this (so in the worst case when the bios 
doesn't clear e.g. the XMM registers or some future registers each 
process could see the state of some previous boot after a warm boot) 

Another way would be to do a fxsave after clearing of known state (x87,MMX,
SSE) at OS bootup and then afterwards set all the so far reserved parts of the 
FXSAVE image to zero. Then restore this image later into each new process.
This would avoid any BIOS/direct warmboot dependencies.  It would work 
assuming that all future IA32 state can be safely initialized with zeroes
via FXRSTOR. Is this a safe assumption?

Thanks, 
-Andi


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Initial process CPU state was Re: SSE related security hole
  2002-04-22 22:51 ` Initial process CPU state was " Andi Kleen
@ 2002-04-22 23:28   ` Linus Torvalds
  0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2002-04-22 23:28 UTC (permalink / raw)
  To: linux-kernel

In article <m3ofgbcppe.fsf@averell.firstfloor.org>,
Andi Kleen  <ak@muc.de> wrote:
>
>Could you quickly describe what the Intel recommended way is to clear
>the whole CPU at the beginning of a process? Is there a better way
>than "save state with fxsave at bootup and restore into each
>new process"?

Note that I will _not_ accept a patch that does the "fxsave at bootup"
thing, because since Linux doesn't actually control the bootup, and
since it gets more an dmore likely that the BIOS will actually use the
SSE etc registers, a boot-time "fxsave" will mean that different
machines will have potentially quite different process initialization. 

In fact, even the same machine might act differently depending on how it
was booted up (ie warm vs cold vs resume boot, different BIOS path due
to different BIOS options etc). 

Now, that wouldn't be a security hole per se, but it would be hell to
debug problems ("My other machine that is identical doesn't show that
bug").

Basically there _needs_ to be an architected way to ensure that the FP
data is in a known and valid state.

(The "fxsave early" approach results in a valid - but not known -
state). 

>Another way would be to do a fxsave after clearing of known state (x87,MMX,
>SSE) at OS bootup and then afterwards set all the so far reserved parts of the 
>FXSAVE image to zero.

This is basically what we do right now (ie as of 2.5.9, just released). 

Except we set it to zero before, since the state we _do_ know about (ie
current x87, MMX, SSE) is initialized to exactly the state you mention
(by hand), and then the rest is just initialized to zero. 

		Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Initial process CPU state was Re: SSE related security hole
@ 2002-04-23 22:07 Saxena, Sunil
  2002-04-24 19:18 ` Bill Davidsen
  0 siblings, 1 reply; 5+ messages in thread
From: Saxena, Sunil @ 2002-04-23 22:07 UTC (permalink / raw)
  To: 'Andi Kleen'; +Cc: linux-kernel

The correct sequence for initializing MMX/SSE/SSE2 CPU state (I exchanged
mail with Linus) is:

	memset(&fxsave, 0, sizeof(struct i387_fxsave_struct));
	fxsave.cwd = 0x37f;
	fxsave.mxcsr = 0x1f80;

	fxrstor(&fxsave);

The above is architectural feature and you can expect this to work in the
future.   Intel will work to document this in our Monthly Specification
Updates and update "IA-32 Intel(R) Architecture Software Developer Manual"s.

Thanks
Sunil

-----Original Message-----
From: Andi Kleen [mailto:ak@muc.de] 
Sent: Monday, April 22, 2002 3:51 PM
To: Saxena, Sunil
Cc: linux-kernel@vger.kernel.org
Subject: Initial process CPU state was Re: SSE related security hole


"Saxena, Sunil" <sunil.saxena@intel.com> writes:

Hallo Sunil,

> We recognized that there is a discrepancy in the individual instruction
> descriptions in Vol 2 where it is indicated that the instruction would
> generate a UD#. We will be rectifying this discrepancy in the next
revision
> of Vol 2 as well as via the monthly Specification Updates.

Could you quickly describe what the Intel recommended way is to clear
the whole CPU at the beginning of a process? Is there a better way
than "save state with fxsave at bootup and restore into each
new process"? After all it would be a bit unfortunate to have
instructions which are transparently tolerant to new CPU state
(fxsave/fxrstor 
for context switching), but no matching way to clear the same state for 
security reasons.  Using the bootup FXSAVE image would make linux
depend on the BIOS for this (so in the worst case when the bios 
doesn't clear e.g. the XMM registers or some future registers each 
process could see the state of some previous boot after a warm boot) 

Another way would be to do a fxsave after clearing of known state (x87,MMX,
SSE) at OS bootup and then afterwards set all the so far reserved parts of
the 
FXSAVE image to zero. Then restore this image later into each new process.
This would avoid any BIOS/direct warmboot dependencies.  It would work 
assuming that all future IA32 state can be safely initialized with zeroes
via FXRSTOR. Is this a safe assumption?

Thanks, 
-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Initial process CPU state was Re: SSE related security hole
  2002-04-23 22:07 Saxena, Sunil
@ 2002-04-24 19:18 ` Bill Davidsen
  0 siblings, 0 replies; 5+ messages in thread
From: Bill Davidsen @ 2002-04-24 19:18 UTC (permalink / raw)
  To: Saxena, Sunil; +Cc: 'Andi Kleen', linux-kernel

On Tue, 23 Apr 2002, Saxena, Sunil wrote:

> The correct sequence for initializing MMX/SSE/SSE2 CPU state (I exchanged
> mail with Linus) is:
> 
> 	memset(&fxsave, 0, sizeof(struct i387_fxsave_struct));
> 	fxsave.cwd = 0x37f;
> 	fxsave.mxcsr = 0x1f80;
> 
> 	fxrstor(&fxsave);
> 
> The above is architectural feature and you can expect this to work in the
> future.   Intel will work to document this in our Monthly Specification
> Updates and update "IA-32 Intel(R) Architecture Software Developer Manual"s.

Sure is nice to see some vendor support!

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2002-04-24 19:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-04-22 22:24 SSE related security hole Saxena, Sunil
2002-04-22 22:51 ` Initial process CPU state was " Andi Kleen
2002-04-22 23:28   ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2002-04-23 22:07 Saxena, Sunil
2002-04-24 19:18 ` Bill Davidsen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox