* [patch] x86: fix taking DNA during 64bit sigreturn
@ 2007-11-11 19:27 Siddha, Suresh B
2007-11-12 19:08 ` Linus Torvalds
0 siblings, 1 reply; 6+ messages in thread
From: Siddha, Suresh B @ 2007-11-11 19:27 UTC (permalink / raw)
To: torvalds, ak, linux-kernel; +Cc: mingo, hpa, tglx, akpm
restore sigcontext is taking a DNA exception while restoring FP context from
the user stack, during the sigreturn. Appended patch fixes it by doing clts()
if the app doesn't touch FP during the signal handler execution. This will
stop generating a DNA, during the fxrstor in the sigreturn.
This improves 64-bit lat_sig numbers by ~30% on my core2 platform.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
diff --git a/arch/x86/kernel/i387_64.c b/arch/x86/kernel/i387_64.c
index 56c1f11..bfaff28 100644
--- a/arch/x86/kernel/i387_64.c
+++ b/arch/x86/kernel/i387_64.c
@@ -92,13 +92,14 @@ int save_i387(struct _fpstate __user *buf)
if (task_thread_info(tsk)->status & TS_USEDFPU) {
err = save_i387_checking((struct i387_fxsave_struct __user *)buf);
if (err) return err;
+ task_thread_info(tsk)->status &= ~TS_USEDFPU;
stts();
- } else {
- if (__copy_to_user(buf, &tsk->thread.i387.fxsave,
+ } else {
+ if (__copy_to_user(buf, &tsk->thread.i387.fxsave,
sizeof(struct i387_fxsave_struct)))
return -1;
- }
- return 1;
+ }
+ return 1;
}
/*
diff --git a/include/asm-x86/i387_64.h b/include/asm-x86/i387_64.h
index 0217b74..3a4ffba 100644
--- a/include/asm-x86/i387_64.h
+++ b/include/asm-x86/i387_64.h
@@ -203,6 +203,11 @@ static inline void save_init_fpu(struct task_struct *tsk)
*/
static inline int restore_i387(struct _fpstate __user *buf)
{
+ set_used_math();
+ if (!(task_thread_info(current)->status & TS_USEDFPU)) {
+ clts();
+ task_thread_info(current)->status |= TS_USEDFPU;
+ }
return restore_fpu_checking((__force struct i387_fxsave_struct *)buf);
}
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn
2007-11-11 19:27 [patch] x86: fix taking DNA during 64bit sigreturn Siddha, Suresh B
@ 2007-11-12 19:08 ` Linus Torvalds
2007-11-12 19:33 ` H. Peter Anvin
2007-11-12 21:16 ` Andi Kleen
0 siblings, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2007-11-12 19:08 UTC (permalink / raw)
To: Siddha, Suresh B; +Cc: ak, linux-kernel, mingo, hpa, tglx, akpm
On Sun, 11 Nov 2007, Siddha, Suresh B wrote:
>
> restore sigcontext is taking a DNA exception while restoring FP context from
> the user stack, during the sigreturn. Appended patch fixes it by doing clts()
> if the app doesn't touch FP during the signal handler execution. This will
> stop generating a DNA, during the fxrstor in the sigreturn.
>
> This improves 64-bit lat_sig numbers by ~30% on my core2 platform.
The sad part here is that the 32-bit kernel doesn't seem to have this
problem at all, and has it fixed thanks to just having cleaner
abstractions (eg the "unlazy_fpu()" function will just automatically clear
TS_USEDFPU as part of __save_init_fpu()).
In comparison, the x86-64 math handling is a total friggin mess. Maybe
it's not *beautiful* in the 32-bit path either, but most of the complexity
in the 32-bit stuff is due to just the complexity of interactions. In the
64-bit code, the complexity seems to come largely from just being crap,
crap, crap.
The *real* fix for this is almost certainly to just get rid of the 64-bit
code entirely, and use the 32-bit code as the base for one single unified
setup. The 32-bit code should be largely a superset of the 64-bit code
anyway, since it has to handle more cases, and does it more cleanly.
Oh, well. Your patch looks good for now.
Btw: the 64-bit code looks like it might have preemption issues too.
Another bug that seems to be fixed in the 32-bit codepath.
Anybody willing to try to do that unification around the 32-bit code? The
biggest issue with the 64-bit code is to avoid the extra rex prefix for
"fxsave", which means that the fxsave/fxrstor thing is done with
"rex64/fxsave %P2(%1)"
: "m" (tsk->thread.i387.fxsave)
: "cdaSDb" (tsk),
"i" (offsetof(__typeof__(*tsk), thread.i387.fxsave)));
which isn't exactly pretty, but the memory address generation works fine
in 32-bit code too, and the rex override is easily done with
#ifdef CONFIG_X86_64
#define REX64 "rex64/"
#else
#define REX64 ""
#endif
and then you just use
REX64 "fxsave"
in the asm statements instead.
So I'll apply this patch for 2.6.24, but considering that it really looks
like x86-64 is preempt-broken too, if somebody does a larger patch that
does the above merge and tests it, I would be open to apply that too. It
does seem to be a real bug (having a scheduling event in the signal path
between the actual fxsave and the clearing of the bits would *seem* to
possibly screw up the FP state!)
But maybe I'm missing some reason why it doesn't matter. The 32-bit code
was fixed back in 2003 (commit 5bff44fc272b948a85e893a007d01b9dfb3ad04f
"Be a lot more careful about TS_USEDFPU and preemption" in the historical
tree with a comment like this:
We had some races where we tested (or set) TS_USEDFPU together
with sequences that depended on the setting (like clearing or
setting the TS flag in %cr0) and we could be preempted in between,
which screws up the FPU state, since preemption will itself change
USEDFPU and the TS flag.
and maybe it's just the 32-bit tree being unnecessarily careful. It's long
enough ago that I don't remember the details, but I have a dim memory of
there actually being some state corruption involved.
Linus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn
2007-11-12 19:08 ` Linus Torvalds
@ 2007-11-12 19:33 ` H. Peter Anvin
2007-11-12 21:16 ` Andi Kleen
1 sibling, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2007-11-12 19:33 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Siddha, Suresh B, ak, linux-kernel, mingo, tglx, akpm
Linus Torvalds wrote:
> Anybody willing to try to do that unification around the 32-bit code?
Heck yes. I'll grab this one.
-hpa
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn
2007-11-12 19:08 ` Linus Torvalds
2007-11-12 19:33 ` H. Peter Anvin
@ 2007-11-12 21:16 ` Andi Kleen
2007-11-13 0:11 ` H. Peter Anvin
1 sibling, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2007-11-12 21:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Siddha, Suresh B, linux-kernel, mingo, hpa, tglx, akpm
>
> The *real* fix for this is almost certainly to just get rid of the 64-bit
> code entirely, and use the 32-bit code as the base for one single unified
> setup.
That would likely break the ABI. x86-64 ABI is completely different here --
no ibcs, just pure x86 ISA.
I always thought direct FXSAVE from/to user space to be a cute trick, but yes
the exception Suresh noticed makes it lose some of its beauty.
> The 32-bit code should be largely a superset of the 64-bit code
> anyway, since it has to handle more cases, and does it more cleanly.
If you consider compat code 64bit handles as many cases as 32bit.
> which isn't exactly pretty, but the memory address generation works fine
> in 32-bit code too, and the rex override is easily done with
>
> #ifdef CONFIG_X86_64
> #define REX64 "rex64/"
> #else
> #define REX64 ""
> #endif
>
> and then you just use
>
> REX64 "fxsave"
That didn't work on older assemblers.
> But maybe I'm missing some reason why it doesn't matter. The 32-bit code
> was fixed back in 2003 (commit 5bff44fc272b948a85e893a007d01b9dfb3ad04f
64bit FPU semantics are somewhat different. I don't remember if this
particular issue was addressed or not, but I fixed a few shared
bugs in a quite different way on 64bit vs 32bit. If anybody wants
to change something here don't assume they are the same.
>From a cursory look it's probably broken though.
-Andi
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn
2007-11-12 21:16 ` Andi Kleen
@ 2007-11-13 0:11 ` H. Peter Anvin
2007-11-13 0:43 ` Andi Kleen
0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2007-11-13 0:11 UTC (permalink / raw)
To: Andi Kleen
Cc: Linus Torvalds, Siddha, Suresh B, linux-kernel, mingo, tglx, akpm
Andi Kleen wrote:
>> The *real* fix for this is almost certainly to just get rid of the 64-bit
>> code entirely, and use the 32-bit code as the base for one single unified
>> setup.
>
> That would likely break the ABI. x86-64 ABI is completely different here --
> no ibcs, just pure x86 ISA.
>
Different ABIs clearly have to be handled, but I don't think that is a
huge deal. The i387 code overall (not just asm/i387.h) is very
different, though, and I find it unlikely that a properly unified code
is going to be ready and working in the 2.6.24 timeframe. I'm exploring
if a partial merge with high confidence level is feasible; either way, a
proper merge for 2.6.25 is probably the right thing.
-hpa
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn
2007-11-13 0:11 ` H. Peter Anvin
@ 2007-11-13 0:43 ` Andi Kleen
0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2007-11-13 0:43 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Linus Torvalds, Siddha, Suresh B, linux-kernel, mingo, tglx, akpm
On Tuesday 13 November 2007 01:11, H. Peter Anvin wrote:
> Andi Kleen wrote:
> >> The *real* fix for this is almost certainly to just get rid of the
> >> 64-bit code entirely, and use the 32-bit code as the base for one single
> >> unified setup.
> >
> > That would likely break the ABI. x86-64 ABI is completely different here
> > -- no ibcs, just pure x86 ISA.
>
> Different ABIs clearly have to be handled,
Well all four cases are quite different.
> but I don't think that is a
> huge deal. The i387 code overall (not just asm/i387.h) is very
> different,
Yes because x86-64 is vastly simpler. e.g. the 64bit path
does not touch the state itself at all. And all the state convert
code moved into the compat layer only. There were also
various other cleanups.
Also 64bit e.g. defers some of the checking to the context
switch, which also simplifies things.
> though, and I find it unlikely that a properly unified code
> is going to be ready and working in the 2.6.24 timeframe.
To be honest it is unclear to me how you can do better than
direct FXSAVE from/to user space on 64bit. Sure you could
put a manual convert pass inbetween like i386, but that
would be rather pointless because it matches 1:1 and in fact would
make the code to be worse (no transparent support of future
FXSAVE extensions) . Not sure what tripped
Linus up so much either. Was it the broken white space?
> I'm exploring
> if a partial merge with high confidence level is feasible; either way, a
> proper merge for 2.6.25 is probably the right thing.
At least on 64bit the result would be probably more complicated
and likely slower code.
If anything it might make sense to add a few more of the 64bit
improvements to 32bit (e.g. delayed exception checking).
The direct FXSAVE path cannot be moved over because it
would break the ABI, otherwise I would have done that
long ago.
BTW if you're looking for real fixes to do iirc some of the FPU
context switch workarounds are missing in KVM which
does mostly its own thing currently.
-Andi
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-11-13 0:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-11 19:27 [patch] x86: fix taking DNA during 64bit sigreturn Siddha, Suresh B
2007-11-12 19:08 ` Linus Torvalds
2007-11-12 19:33 ` H. Peter Anvin
2007-11-12 21:16 ` Andi Kleen
2007-11-13 0:11 ` H. Peter Anvin
2007-11-13 0:43 ` Andi Kleen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox