* [patch] x86: fix taking DNA during 64bit sigreturn @ 2007-11-11 19:27 Siddha, Suresh B 2007-11-12 19:08 ` Linus Torvalds 0 siblings, 1 reply; 6+ messages in thread From: Siddha, Suresh B @ 2007-11-11 19:27 UTC (permalink / raw) To: torvalds, ak, linux-kernel; +Cc: mingo, hpa, tglx, akpm restore sigcontext is taking a DNA exception while restoring FP context from the user stack, during the sigreturn. Appended patch fixes it by doing clts() if the app doesn't touch FP during the signal handler execution. This will stop generating a DNA, during the fxrstor in the sigreturn. This improves 64-bit lat_sig numbers by ~30% on my core2 platform. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> --- diff --git a/arch/x86/kernel/i387_64.c b/arch/x86/kernel/i387_64.c index 56c1f11..bfaff28 100644 --- a/arch/x86/kernel/i387_64.c +++ b/arch/x86/kernel/i387_64.c @@ -92,13 +92,14 @@ int save_i387(struct _fpstate __user *buf) if (task_thread_info(tsk)->status & TS_USEDFPU) { err = save_i387_checking((struct i387_fxsave_struct __user *)buf); if (err) return err; + task_thread_info(tsk)->status &= ~TS_USEDFPU; stts(); - } else { - if (__copy_to_user(buf, &tsk->thread.i387.fxsave, + } else { + if (__copy_to_user(buf, &tsk->thread.i387.fxsave, sizeof(struct i387_fxsave_struct))) return -1; - } - return 1; + } + return 1; } /* diff --git a/include/asm-x86/i387_64.h b/include/asm-x86/i387_64.h index 0217b74..3a4ffba 100644 --- a/include/asm-x86/i387_64.h +++ b/include/asm-x86/i387_64.h @@ -203,6 +203,11 @@ static inline void save_init_fpu(struct task_struct *tsk) */ static inline int restore_i387(struct _fpstate __user *buf) { + set_used_math(); + if (!(task_thread_info(current)->status & TS_USEDFPU)) { + clts(); + task_thread_info(current)->status |= TS_USEDFPU; + } return restore_fpu_checking((__force struct i387_fxsave_struct *)buf); } ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn 2007-11-11 19:27 [patch] x86: fix taking DNA during 64bit sigreturn Siddha, Suresh B @ 2007-11-12 19:08 ` Linus Torvalds 2007-11-12 19:33 ` H. Peter Anvin 2007-11-12 21:16 ` Andi Kleen 0 siblings, 2 replies; 6+ messages in thread From: Linus Torvalds @ 2007-11-12 19:08 UTC (permalink / raw) To: Siddha, Suresh B; +Cc: ak, linux-kernel, mingo, hpa, tglx, akpm On Sun, 11 Nov 2007, Siddha, Suresh B wrote: > > restore sigcontext is taking a DNA exception while restoring FP context from > the user stack, during the sigreturn. Appended patch fixes it by doing clts() > if the app doesn't touch FP during the signal handler execution. This will > stop generating a DNA, during the fxrstor in the sigreturn. > > This improves 64-bit lat_sig numbers by ~30% on my core2 platform. The sad part here is that the 32-bit kernel doesn't seem to have this problem at all, and has it fixed thanks to just having cleaner abstractions (eg the "unlazy_fpu()" function will just automatically clear TS_USEDFPU as part of __save_init_fpu()). In comparison, the x86-64 math handling is a total friggin mess. Maybe it's not *beautiful* in the 32-bit path either, but most of the complexity in the 32-bit stuff is due to just the complexity of interactions. In the 64-bit code, the complexity seems to come largely from just being crap, crap, crap. The *real* fix for this is almost certainly to just get rid of the 64-bit code entirely, and use the 32-bit code as the base for one single unified setup. The 32-bit code should be largely a superset of the 64-bit code anyway, since it has to handle more cases, and does it more cleanly. Oh, well. Your patch looks good for now. Btw: the 64-bit code looks like it might have preemption issues too. Another bug that seems to be fixed in the 32-bit codepath. Anybody willing to try to do that unification around the 32-bit code? The biggest issue with the 64-bit code is to avoid the extra rex prefix for "fxsave", which means that the fxsave/fxrstor thing is done with "rex64/fxsave %P2(%1)" : "m" (tsk->thread.i387.fxsave) : "cdaSDb" (tsk), "i" (offsetof(__typeof__(*tsk), thread.i387.fxsave))); which isn't exactly pretty, but the memory address generation works fine in 32-bit code too, and the rex override is easily done with #ifdef CONFIG_X86_64 #define REX64 "rex64/" #else #define REX64 "" #endif and then you just use REX64 "fxsave" in the asm statements instead. So I'll apply this patch for 2.6.24, but considering that it really looks like x86-64 is preempt-broken too, if somebody does a larger patch that does the above merge and tests it, I would be open to apply that too. It does seem to be a real bug (having a scheduling event in the signal path between the actual fxsave and the clearing of the bits would *seem* to possibly screw up the FP state!) But maybe I'm missing some reason why it doesn't matter. The 32-bit code was fixed back in 2003 (commit 5bff44fc272b948a85e893a007d01b9dfb3ad04f "Be a lot more careful about TS_USEDFPU and preemption" in the historical tree with a comment like this: We had some races where we tested (or set) TS_USEDFPU together with sequences that depended on the setting (like clearing or setting the TS flag in %cr0) and we could be preempted in between, which screws up the FPU state, since preemption will itself change USEDFPU and the TS flag. and maybe it's just the 32-bit tree being unnecessarily careful. It's long enough ago that I don't remember the details, but I have a dim memory of there actually being some state corruption involved. Linus ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn 2007-11-12 19:08 ` Linus Torvalds @ 2007-11-12 19:33 ` H. Peter Anvin 2007-11-12 21:16 ` Andi Kleen 1 sibling, 0 replies; 6+ messages in thread From: H. Peter Anvin @ 2007-11-12 19:33 UTC (permalink / raw) To: Linus Torvalds; +Cc: Siddha, Suresh B, ak, linux-kernel, mingo, tglx, akpm Linus Torvalds wrote: > Anybody willing to try to do that unification around the 32-bit code? Heck yes. I'll grab this one. -hpa ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn 2007-11-12 19:08 ` Linus Torvalds 2007-11-12 19:33 ` H. Peter Anvin @ 2007-11-12 21:16 ` Andi Kleen 2007-11-13 0:11 ` H. Peter Anvin 1 sibling, 1 reply; 6+ messages in thread From: Andi Kleen @ 2007-11-12 21:16 UTC (permalink / raw) To: Linus Torvalds; +Cc: Siddha, Suresh B, linux-kernel, mingo, hpa, tglx, akpm > > The *real* fix for this is almost certainly to just get rid of the 64-bit > code entirely, and use the 32-bit code as the base for one single unified > setup. That would likely break the ABI. x86-64 ABI is completely different here -- no ibcs, just pure x86 ISA. I always thought direct FXSAVE from/to user space to be a cute trick, but yes the exception Suresh noticed makes it lose some of its beauty. > The 32-bit code should be largely a superset of the 64-bit code > anyway, since it has to handle more cases, and does it more cleanly. If you consider compat code 64bit handles as many cases as 32bit. > which isn't exactly pretty, but the memory address generation works fine > in 32-bit code too, and the rex override is easily done with > > #ifdef CONFIG_X86_64 > #define REX64 "rex64/" > #else > #define REX64 "" > #endif > > and then you just use > > REX64 "fxsave" That didn't work on older assemblers. > But maybe I'm missing some reason why it doesn't matter. The 32-bit code > was fixed back in 2003 (commit 5bff44fc272b948a85e893a007d01b9dfb3ad04f 64bit FPU semantics are somewhat different. I don't remember if this particular issue was addressed or not, but I fixed a few shared bugs in a quite different way on 64bit vs 32bit. If anybody wants to change something here don't assume they are the same. >From a cursory look it's probably broken though. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn 2007-11-12 21:16 ` Andi Kleen @ 2007-11-13 0:11 ` H. Peter Anvin 2007-11-13 0:43 ` Andi Kleen 0 siblings, 1 reply; 6+ messages in thread From: H. Peter Anvin @ 2007-11-13 0:11 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, Siddha, Suresh B, linux-kernel, mingo, tglx, akpm Andi Kleen wrote: >> The *real* fix for this is almost certainly to just get rid of the 64-bit >> code entirely, and use the 32-bit code as the base for one single unified >> setup. > > That would likely break the ABI. x86-64 ABI is completely different here -- > no ibcs, just pure x86 ISA. > Different ABIs clearly have to be handled, but I don't think that is a huge deal. The i387 code overall (not just asm/i387.h) is very different, though, and I find it unlikely that a properly unified code is going to be ready and working in the 2.6.24 timeframe. I'm exploring if a partial merge with high confidence level is feasible; either way, a proper merge for 2.6.25 is probably the right thing. -hpa ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] x86: fix taking DNA during 64bit sigreturn 2007-11-13 0:11 ` H. Peter Anvin @ 2007-11-13 0:43 ` Andi Kleen 0 siblings, 0 replies; 6+ messages in thread From: Andi Kleen @ 2007-11-13 0:43 UTC (permalink / raw) To: H. Peter Anvin Cc: Linus Torvalds, Siddha, Suresh B, linux-kernel, mingo, tglx, akpm On Tuesday 13 November 2007 01:11, H. Peter Anvin wrote: > Andi Kleen wrote: > >> The *real* fix for this is almost certainly to just get rid of the > >> 64-bit code entirely, and use the 32-bit code as the base for one single > >> unified setup. > > > > That would likely break the ABI. x86-64 ABI is completely different here > > -- no ibcs, just pure x86 ISA. > > Different ABIs clearly have to be handled, Well all four cases are quite different. > but I don't think that is a > huge deal. The i387 code overall (not just asm/i387.h) is very > different, Yes because x86-64 is vastly simpler. e.g. the 64bit path does not touch the state itself at all. And all the state convert code moved into the compat layer only. There were also various other cleanups. Also 64bit e.g. defers some of the checking to the context switch, which also simplifies things. > though, and I find it unlikely that a properly unified code > is going to be ready and working in the 2.6.24 timeframe. To be honest it is unclear to me how you can do better than direct FXSAVE from/to user space on 64bit. Sure you could put a manual convert pass inbetween like i386, but that would be rather pointless because it matches 1:1 and in fact would make the code to be worse (no transparent support of future FXSAVE extensions) . Not sure what tripped Linus up so much either. Was it the broken white space? > I'm exploring > if a partial merge with high confidence level is feasible; either way, a > proper merge for 2.6.25 is probably the right thing. At least on 64bit the result would be probably more complicated and likely slower code. If anything it might make sense to add a few more of the 64bit improvements to 32bit (e.g. delayed exception checking). The direct FXSAVE path cannot be moved over because it would break the ABI, otherwise I would have done that long ago. BTW if you're looking for real fixes to do iirc some of the FPU context switch workarounds are missing in KVM which does mostly its own thing currently. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-11-13 0:47 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-11-11 19:27 [patch] x86: fix taking DNA during 64bit sigreturn Siddha, Suresh B 2007-11-12 19:08 ` Linus Torvalds 2007-11-12 19:33 ` H. Peter Anvin 2007-11-12 21:16 ` Andi Kleen 2007-11-13 0:11 ` H. Peter Anvin 2007-11-13 0:43 ` Andi Kleen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox