* lazy fpu switch irrelavant to no-fpu case?
@ 2002-02-22 2:48 Jun Sun
2002-02-22 3:57 ` Jun Sun
2002-02-22 9:45 ` Kevin D. Kissell
0 siblings, 2 replies; 9+ messages in thread
From: Jun Sun @ 2002-02-22 2:48 UTC (permalink / raw)
To: linux-mips
It appears to me that the lazy FPU switch has no relevance to CPUs that don't
have an FPU.
If you do a scan, you will see that last_task_used_math is used in four kernel
files:
ptrace.c
process.c
signal.c
traps.c
In the case of ptrace.c and process.c, the variable is used only when CPU has
FPU.
In the case of traps.c (do_cpu()), it is used redundantly alongside another
condition check.
In the case of signal.c, no matter what last_task_used_math is, the same code
will be executed anyway.
Now that I think about it, it actually makes sense: if we don't have a
hardware FPU, why would we care about FPU context switches?
Anyhow, the problem I am seeing with the FPU/SMP case seems to be caused by
the FPU emulation code itself, if we can assume it is not caused by the FPU
context switch. Right now the FPU is not turned on on the box.
The following patch cleans it up a little based on the above observation.
Make sense?
Jun
diff -Nru linux/arch/mips/kernel/traps.c.orig linux/arch/mips/kernel/traps.c
--- linux/arch/mips/kernel/traps.c.orig Wed Jan 30 15:17:12 2002
+++ linux/arch/mips/kernel/traps.c Thu Feb 21 18:46:28 2002
@@ -678,14 +678,11 @@
 	return;
 
 fp_emul:
-	if (last_task_used_math != current) {
-		if (!current->used_math) {
-			fpu_emulator_init_fpu();
-			current->used_math = 1;
-		}
+	if (!current->used_math) {
+		fpu_emulator_init_fpu();
+		current->used_math = 1;
 	}
 	sig = fpu_emulator_cop1Handler(regs);
-	last_task_used_math = current;
 	if (sig)
 		force_sig(sig, current);
 	return;
^ permalink raw reply [flat|nested] 9+ messages in thread
* ieee754_csr is the problem (Re: lazy fpu switch irrelavant to no-fpu case?
@ 2002-02-22 3:57 ` Jun Sun
0 siblings, 0 replies; 9+ messages in thread
From: Jun Sun @ 2002-02-22 3:57 UTC (permalink / raw)
To: linux-mips
[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

Jun Sun wrote:
> Anyhow, the problem I am seeing with the FPU/SMP case seems to be caused by
> the FPU emulation code itself, if we can assume it is not caused by the FPU
> context switch. Right now the FPU is not turned on on the box.
>

OK, I found the guilty part in the FPU emulator. It is the global variable
ieee754_csr. The following patch seems to fix the problem. I am sure someone
who is more familiar with the FPU might be able to make it more elegant.

There is another global variable which is potentially dangerous for SMP. It
is fpuemuprivate. Currently it is used almost only for accounting and
read-only purposes. I did not bother to change it. It should be fixed too, I
suppose.

Cheers.

Jun

[-- Attachment #2: patch3 --]
[-- Type: text/plain, Size: 2183 bytes --]
diff -Nru linux/arch/mips/math-emu/ieee754.h.orig linux/arch/mips/math-emu/ieee754.h
--- linux/arch/mips/math-emu/ieee754.h.orig Thu Jan 31 17:13:26 2002
+++ linux/arch/mips/math-emu/ieee754.h Thu Feb 21 19:34:06 2002
@@ -323,7 +323,7 @@
 
 /* the control status register
 */
-struct ieee754_csr {
+struct ieee754_csr_struct {
 	unsigned pad:13;
 	unsigned nod:1;	/* set 1 for no denormalised numbers */
 	unsigned cx:5;	/* exceptions this operation */
@@ -331,7 +331,13 @@
 	unsigned sx:5;	/* exceptions total */
 	unsigned rm:2;	/* current rounding mode */
 };
-extern struct ieee754_csr ieee754_csr;
+
+#include <linux/sched.h>
+#include <linux/threads.h>
+#include <linux/smp.h>
+#include <asm/current.h>
+extern struct ieee754_csr_struct ieee754_csr_array[NR_CPUS];
+#define ieee754_csr ieee754_csr_array[smp_processor_id()]
 
 static __inline unsigned ieee754_getrm(void)
 {
diff -Nru linux/arch/mips/math-emu/ieee754.c.orig linux/arch/mips/math-emu/ieee754.c
--- linux/arch/mips/math-emu/ieee754.c.orig Mon Jan 28 11:17:14 2002
+++ linux/arch/mips/math-emu/ieee754.c Thu Feb 21 19:37:32 2002
@@ -52,7 +52,7 @@
 
 /* the control status register
 */
-struct ieee754_csr ieee754_csr;
+struct ieee754_csr_struct ieee754_csr_array[NR_CPUS];
 
 /* special constants
 */
diff -Nru linux/arch/mips/math-emu/cp1emu.c.orig linux/arch/mips/math-emu/cp1emu.c
--- linux/arch/mips/math-emu/cp1emu.c.orig Mon Jan 28 11:17:14 2002
+++ linux/arch/mips/math-emu/cp1emu.c Thu Feb 21 19:22:45 2002
@@ -945,7 +945,7 @@
 static ieee754##p fpemu_##p##_##name (ieee754##p r, ieee754##p s, \
 	ieee754##p t) \
 { \
-	struct ieee754_csr ieee754_csr_save; \
+	struct ieee754_csr_struct ieee754_csr_save; \
 	s = f1 (s, t); \
 	ieee754_csr_save = ieee754_csr; \
 	s = f2 (s, r); \
diff -Nru linux/arch/mips/math-emu/dp_sqrt.c.orig linux/arch/mips/math-emu/dp_sqrt.c
--- linux/arch/mips/math-emu/dp_sqrt.c.orig Thu Feb 21 19:41:09 2002
+++ linux/arch/mips/math-emu/dp_sqrt.c Thu Feb 21 19:39:08 2002
@@ -37,7 +37,7 @@
 
 ieee754dp ieee754dp_sqrt(ieee754dp x)
 {
-	struct ieee754_csr oldcsr;
+	struct ieee754_csr_struct oldcsr;
 	ieee754dp y, z, t;
 	unsigned scalx, yh;
 	COMPXDP;
^ permalink raw reply [flat|nested] 9+ messages in thread
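The per-CPU array idea in the patch above can be modelled outside the kernel. The following is only a minimal user-space sketch in C, not the kernel code: DEMO_NR_CPUS, struct demo_csr and the demo_* functions are hypothetical names, the struct carries only a subset of the fields visible in the patch, and the CPU number is passed in explicitly where the patch uses smp_processor_id().

```c
/* Hypothetical stand-in for the kernel's NR_CPUS. */
#define DEMO_NR_CPUS 4

/* Subset of the fields visible in the patch; the rest are omitted here. */
struct demo_csr {
	unsigned nod:1;	/* set 1 for no denormalised numbers */
	unsigned cx:5;	/* exceptions this operation */
	unsigned sx:5;	/* exceptions total */
	unsigned rm:2;	/* current rounding mode */
};

/* One slot per CPU instead of a single shared global. */
static struct demo_csr demo_csr_array[DEMO_NR_CPUS];

/* Assumption: the caller supplies the CPU number (the patch indexes the
 * array with smp_processor_id() instead). */
struct demo_csr *demo_csr_on(unsigned cpu)
{
	return &demo_csr_array[cpu % DEMO_NR_CPUS];
}

/* Set the rounding mode in the given CPU's slot only. */
void demo_set_rm(unsigned cpu, unsigned rm)
{
	demo_csr_on(cpu)->rm = rm & 3u;
}

/* Read the rounding mode back from the given CPU's slot. */
unsigned demo_get_rm(unsigned cpu)
{
	return demo_csr_on(cpu)->rm;
}
```

Setting the rounding mode for one CPU leaves the other slots untouched, which is the property the shared global lacked. The scheme presumably relies on the emulation path staying on one CPU between accesses to its slot.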
* Re: ieee754_csr is the problem (Re: lazy fpu switch irrelavant to no-fpu case?
@ 2002-02-22 9:59 ` Kevin D. Kissell
0 siblings, 0 replies; 9+ messages in thread
From: Kevin D. Kissell @ 2002-02-22 9:59 UTC (permalink / raw)
To: Jun Sun, linux-mips

This is what I get for processing my mail in-order.
I just got done writing a message asking if the
ieee754_csr issue might be at the root of your
problem.

Anyway, rather than create an array of the damned
things, I would think that the "best" thing to do would
be to merge the "abstract" IEEE CSR with the
simulated MIPS CSR (by adding the "noq" and
"nod" bits in otherwise unused/reserved bit positions),
and to use the thread-local CSR copy for all of the
ieee754_csr manipulations, much as I did for
the FP registers. That would be a bit more intrusive
than your proposed hack, however, and only slightly
more efficient.

Kevin K.

----- Original Message -----
From: "Jun Sun" <jsun@mvista.com>
To: <linux-mips@oss.sgi.com>
Sent: Friday, February 22, 2002 4:57 AM
Subject: ieee754_csr is the problem (Re: lazy fpu switch irrelavant to no-fpu case?

> Jun Sun wrote:
> > Anyhow, the problem I am seeing with the FPU/SMP case seems to be caused by
> > the FPU emulation code itself, if we can assume it is not caused by the FPU
> > context switch. Right now the FPU is not turned on on the box.
> >
>
> OK, I found the guilty part in the FPU emulator. It is the global variable
> ieee754_csr. The following patch seems to fix the problem. I am sure someone
> who is more familiar with the FPU might be able to make it more elegant.
>
> There is another global variable which is potentially dangerous for SMP. It
> is fpuemuprivate. Currently it is used almost only for accounting and
> read-only purposes. I did not bother to change it. It should be fixed too, I
> suppose.
>
> Cheers.
>
> Jun

--------------------------------------------------------------------------------

> diff -Nru linux/arch/mips/math-emu/ieee754.h.orig linux/arch/mips/math-emu/ieee754.h
> --- linux/arch/mips/math-emu/ieee754.h.orig Thu Jan 31 17:13:26 2002
> +++ linux/arch/mips/math-emu/ieee754.h Thu Feb 21 19:34:06 2002
> @@ -323,7 +323,7 @@
>
>  /* the control status register
>  */
> -struct ieee754_csr {
> +struct ieee754_csr_struct {
>  	unsigned pad:13;
>  	unsigned nod:1;	/* set 1 for no denormalised numbers */
>  	unsigned cx:5;	/* exceptions this operation */
> @@ -331,7 +331,13 @@
>  	unsigned sx:5;	/* exceptions total */
>  	unsigned rm:2;	/* current rounding mode */
>  };
> -extern struct ieee754_csr ieee754_csr;
> +
> +#include <linux/sched.h>
> +#include <linux/threads.h>
> +#include <linux/smp.h>
> +#include <asm/current.h>
> +extern struct ieee754_csr_struct ieee754_csr_array[NR_CPUS];
> +#define ieee754_csr ieee754_csr_array[smp_processor_id()]
>
>  static __inline unsigned ieee754_getrm(void)
>  {
> diff -Nru linux/arch/mips/math-emu/ieee754.c.orig linux/arch/mips/math-emu/ieee754.c
> --- linux/arch/mips/math-emu/ieee754.c.orig Mon Jan 28 11:17:14 2002
> +++ linux/arch/mips/math-emu/ieee754.c Thu Feb 21 19:37:32 2002
> @@ -52,7 +52,7 @@
>
>  /* the control status register
>  */
> -struct ieee754_csr ieee754_csr;
> +struct ieee754_csr_struct ieee754_csr_array[NR_CPUS];
>
>  /* special constants
>  */
> diff -Nru linux/arch/mips/math-emu/cp1emu.c.orig linux/arch/mips/math-emu/cp1emu.c
> --- linux/arch/mips/math-emu/cp1emu.c.orig Mon Jan 28 11:17:14 2002
> +++ linux/arch/mips/math-emu/cp1emu.c Thu Feb 21 19:22:45 2002
> @@ -945,7 +945,7 @@
>  static ieee754##p fpemu_##p##_##name (ieee754##p r, ieee754##p s, \
>  	ieee754##p t) \
>  { \
> -	struct ieee754_csr ieee754_csr_save; \
> +	struct ieee754_csr_struct ieee754_csr_save; \
>  	s = f1 (s, t); \
>  	ieee754_csr_save = ieee754_csr; \
>  	s = f2 (s, r); \
> diff -Nru linux/arch/mips/math-emu/dp_sqrt.c.orig linux/arch/mips/math-emu/dp_sqrt.c
> --- linux/arch/mips/math-emu/dp_sqrt.c.orig Thu Feb 21 19:41:09 2002
> +++ linux/arch/mips/math-emu/dp_sqrt.c Thu Feb 21 19:39:08 2002
> @@ -37,7 +37,7 @@
>
>  ieee754dp ieee754dp_sqrt(ieee754dp x)
>  {
> -	struct ieee754_csr oldcsr;
> +	struct ieee754_csr_struct oldcsr;
>  	ieee754dp y, z, t;
>  	unsigned scalx, yh;
>  	COMPXDP;
>
^ permalink raw reply [flat|nested] 9+ messages in thread
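The alternative suggested in the reply above, keeping the status bits in the per-thread context rather than in any global, might look roughly like the sketch below. It is a user-space illustration under stated assumptions, not the emulator's code: struct demo_fp_ctx and the demo_ctx_* helpers are hypothetical, and the bit positions follow the MIPS FCSR layout as I understand it (rounding mode in bits 1:0, sticky flags in bits 6:2, cause bits for the five IEEE exceptions in bits 16:12). No position is chosen for the "noq" and "nod" bits, since the thread does not name one.

```c
#include <stdint.h>

/* Hypothetical per-thread FP context; the real kernel layout differs. */
struct demo_fp_ctx {
	uint64_t fpr[32];	/* emulated FP registers */
	uint32_t fcr31;		/* emulated MIPS control/status word */
};

#define DEMO_RM_MASK		0x3u	/* rounding mode, bits 1:0 */
#define DEMO_FLAG_SHIFT		2	/* sticky flags start at bit 2 */
#define DEMO_CAUSE_SHIFT	12	/* cause bits start at bit 12 */
#define DEMO_X_MASK		0x1fu	/* five IEEE exception bits */

/* Every accessor takes the thread's own context, so no state is shared. */
unsigned demo_ctx_get_rm(const struct demo_fp_ctx *ctx)
{
	return ctx->fcr31 & DEMO_RM_MASK;
}

void demo_ctx_set_rm(struct demo_fp_ctx *ctx, unsigned rm)
{
	ctx->fcr31 = (ctx->fcr31 & ~DEMO_RM_MASK) | (rm & DEMO_RM_MASK);
}

/* Record the exceptions of the current operation in the cause field and
 * accumulate them into the sticky flags. */
void demo_ctx_raise(struct demo_fp_ctx *ctx, unsigned xbits)
{
	xbits &= DEMO_X_MASK;
	ctx->fcr31 &= ~(DEMO_X_MASK << DEMO_CAUSE_SHIFT);
	ctx->fcr31 |= xbits << DEMO_CAUSE_SHIFT;
	ctx->fcr31 |= xbits << DEMO_FLAG_SHIFT;
}
```

Because each helper works on the context it is handed, two threads emulating FP instructions on different CPUs never touch the same word, and no copy between a global and the thread image is needed.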
* Re: ieee754_csr is the problem (Re: lazy fpu switch irrelavant to no-fpu case?
@ 2002-02-22 17:08 ` Kjeld Borch Egevang
0 siblings, 0 replies; 9+ messages in thread
From: Kjeld Borch Egevang @ 2002-02-22 17:08 UTC (permalink / raw)
To: linux-mips

In mips.test, you wrote:
>This is what I get for processing my mail in-order.
>I just got done writing a message asking if the
>ieee754_csr issue might be at the root of your
>problem.
>
>Anyway, rather than create an array of the damned
>things, I would think that the "best" thing to do would
>be to merge the "abstract" IEEE CSR with the
>simulated MIPS CSR (by adding the "noq" and
>"nod" bits in otherwise unused/reserved bit positions),
>and to use the thread-local CSR copy for all of the
>ieee754_csr manipulations, much as I did for
>the FP registers. That would be a bit more intrusive
>than your proposed hack, however, and only slightly
>more efficient.

I've been wondering: Why was the CSR copy made in the first place?

/Kjeld

--
_    _ ____  ___                       Mailto:kjelde@mips.com
|\  /|||___)(___    MIPS Denmark       Direct: +45 44 86 55 85
| \/ |||    ____)   Lautrupvang 4 B    Switch: +45 44 86 55 55
  TECHNOLOGIES      DK-2750 Ballerup   Fax...: +45 44 86 55 56
                    Denmark            http://www.mips.com/
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: lazy fpu switch irrelavant to no-fpu case?
@ 2002-02-22 9:45 ` Kevin D. Kissell
0 siblings, 0 replies; 9+ messages in thread
From: Kevin D. Kissell @ 2002-02-22 9:45 UTC (permalink / raw)
To: Jun Sun, linux-mips

In the very first cut at integrating the Algorithmics emulator with Linux,
the emulator actually contained storage that represented the FPU registers,
and FP context management was meaningful. Using thread context storage
directly for the FPU registers was an optimisation that I did after I got
the code running, and I didn't bother eliminating the last_task_used_math
setup, probably on the basis that it wasn't costing much and that it might
still be useful in some way. I don't think you'll break anything by getting
rid of it, but I don't think you'll fix anything either.

As I stated in another message on the subject of SMP problems observed with
the FPU emulator, while the basic mechanisms of FP emulation should be SMP
safe, there may well be non-SMP artifacts in the code. A cursory inspection
shows that there is a single mips_fpu_emulator_private data structure for
the emulator, which contains statistics that risk being corrupted because
non-atomic increments are used. That ought to be fixed, but should not cause
any user-mode-visible problems.

But I also note that the emulator uses a single global storage location for
"ieee754_csr". The kernel port of the code does copies between the thread
context image of the MIPS csr and this global which are manifestly SMP
unsafe. Could the bugs you're seeing be explainable by corruption of
rounding mode and exception state?

Regards,

Kevin K.

----- Original Message -----
From: "Jun Sun" <jsun@mvista.com>
To: <linux-mips@oss.sgi.com>
Sent: Friday, February 22, 2002 3:48 AM
Subject: lazy fpu switch irrelavant to no-fpu case?

>
> It appears to me that the lazy FPU switch has no relevance to CPUs that don't
> have an FPU.
>
> If you do a scan, you will see that last_task_used_math is used in four kernel
> files:
>
> ptrace.c
> process.c
> signal.c
> traps.c
>
> In the case of ptrace.c and process.c, the variable is used only when CPU has
> FPU.
>
> In the case of traps.c (do_cpu()), it is used redundantly alongside another
> condition check.
>
> In the case of signal.c, no matter what last_task_used_math is, the same code
> will be executed anyway.
>
> Now that I think about it, it actually makes sense: if we don't have a
> hardware FPU, why would we care about FPU context switches?
>
> Anyhow, the problem I am seeing with the FPU/SMP case seems to be caused by
> the FPU emulation code itself, if we can assume it is not caused by the FPU
> context switch. Right now the FPU is not turned on on the box.
>
> The following patch cleans it up a little based on the above observation.
> Make sense?
>
> Jun
>
> diff -Nru linux/arch/mips/kernel/traps.c.orig linux/arch/mips/kernel/traps.c
> --- linux/arch/mips/kernel/traps.c.orig Wed Jan 30 15:17:12 2002
> +++ linux/arch/mips/kernel/traps.c Thu Feb 21 18:46:28 2002
> @@ -678,14 +678,11 @@
>  	return;
>
>  fp_emul:
> -	if (last_task_used_math != current) {
> -		if (!current->used_math) {
> -			fpu_emulator_init_fpu();
> -			current->used_math = 1;
> -		}
> +	if (!current->used_math) {
> +		fpu_emulator_init_fpu();
> +		current->used_math = 1;
>  	}
>  	sig = fpu_emulator_cop1Handler(regs);
> -	last_task_used_math = current;
>  	if (sig)
>  		force_sig(sig, current);
>  	return;
>
^ permalink raw reply [flat|nested] 9+ messages in thread
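The statistics concern raised in the reply above (counters updated with non-atomic increments from several CPUs) can be illustrated in user space with C11 atomics. This is only a sketch under assumptions: struct demo_fpu_stats, its field names and the demo_* functions are hypothetical, and kernel code would use the kernel's own atomic primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical counters; the field names are illustrative only. */
struct demo_fpu_stats {
	atomic_ulong emulated;	/* instructions emulated */
	atomic_ulong errors;	/* emulation failures */
};

/* Static storage is zero-initialised, which is a valid atomic state. */
static struct demo_fpu_stats demo_stats;

/* An atomic read-modify-write cannot lose updates the way a plain
 * "counter++" can when two CPUs increment at the same time. */
void demo_count_emulated(void)
{
	atomic_fetch_add_explicit(&demo_stats.emulated, 1,
				  memory_order_relaxed);
}

unsigned long demo_read_emulated(void)
{
	return atomic_load_explicit(&demo_stats.emulated,
				    memory_order_relaxed);
}
```

Relaxed ordering is chosen here on the assumption that the counters are pure statistics and nothing else is synchronised through them.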