* floating point exception
@ 2002-01-13 12:43 Christian Thalinger
2002-01-15 23:28 ` Brian Gerst
0 siblings, 1 reply; 27+ messages in thread
From: Christian Thalinger @ 2002-01-13 12:43 UTC (permalink / raw)
To: linux-kernel
Hi!
Just downloaded again, after a long time, the setiathome client. I
wanted to look how smooth my tyan dual works. So i started the client
and after a few seconds it gets a `floating point exception'. No
problem till now, cause it seems to be a seti bug. Ok.
Right after that my window manager segfaults. Ok, switch to console,
restart it and go. No! Can't start any programs anymore, no login. All
tasks die one after the other, up to the complete lock of the machine.
Even alt-sysrq doesn't work.
So, this is kernel 2.4.17 and i'll try other kernels right after this
email.
Does anyone know what's going on?
^ permalink raw reply [flat|nested] 27+ messages in thread
* floating point exception
@ 2002-01-14 10:56 Zwane Mwaikambo
2002-01-14 21:26 ` Christian Thalinger
0 siblings, 1 reply; 27+ messages in thread
From: Zwane Mwaikambo @ 2002-01-14 10:56 UTC (permalink / raw)
To: Linux Kernel
>Right after that my window manager segfaults. Ok, switch to console,
>restart it and go. No! Can't start any programs anymore, no login. All
>tasks die one after the other, up to the complete lock of the machine.
>Even alt-sysrq doesn't work.
Can you reproduce the problem with some degree of success? (2/5 is fine)
Regards,
Zwane Mwaikambo
* Re: floating point exception
2002-01-14 10:56 Zwane Mwaikambo
@ 2002-01-14 21:26 ` Christian Thalinger
2002-01-15 14:34 ` Zwane Mwaikambo
0 siblings, 1 reply; 27+ messages in thread
From: Christian Thalinger @ 2002-01-14 21:26 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: Linux Kernel
On Mon, 2002-01-14 at 11:56, Zwane Mwaikambo wrote:
> >Right after that my window manager segfaults. Ok, switch to console,
> >restart it and go. No! Can't start any programs anymore, no login. All
> >tasks die one after the other, up to the complete lock of the machine.
> >Even alt-sysrq doesn't work.
>
> Can you reproduce the problem with some degree of success? (2/5 is fine)
>
> Regards,
> Zwane Mwaikambo
>
After a little bit of testing i would say yes. 2-3 out of 5 with kernel
2.4.17 and 2.4.18-pre3. Mainly with X, got some without X.
It seems the floating point exception is only raised with a new data
package. Is there a simple way to raise such an exception?
* Re: floating point exception
2002-01-14 21:26 ` Christian Thalinger
@ 2002-01-15 14:34 ` Zwane Mwaikambo
2002-01-15 14:46 ` Richard B. Johnson
2002-01-15 18:19 ` Christian Thalinger
0 siblings, 2 replies; 27+ messages in thread
From: Zwane Mwaikambo @ 2002-01-15 14:34 UTC (permalink / raw)
To: Christian Thalinger; +Cc: Linux Kernel
On 14 Jan 2002, Christian Thalinger wrote:
> It seems the floating point exception is only raised with a new data
> package. Is there a simple way to raise such an exception?
New data package? And does the same behaviour re-occur after the fpu
exception? ie programs start segfaulting etc. Can you try doing a "dmesg"
after the segfaults and fpu exception and see if there is anything in the
kernel ring buffer too.
Regards,
Zwane Mwaikambo
* Re: floating point exception
2002-01-15 14:34 ` Zwane Mwaikambo
@ 2002-01-15 14:46 ` Richard B. Johnson
2002-01-15 18:19 ` Christian Thalinger
1 sibling, 0 replies; 27+ messages in thread
From: Richard B. Johnson @ 2002-01-15 14:46 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: Christian Thalinger, Linux Kernel
On Tue, 15 Jan 2002, Zwane Mwaikambo wrote:
> On 14 Jan 2002, Christian Thalinger wrote:
>
> > It seems the floating point exception is only raised with a new data
> > package. Is there a simple way to raise such an exception?
>
> New data package? And does the same behaviour re-occur after the fpu
> exception? ie programs start segfaulting etc. Can you try doing a "dmesg"
> after the segfaults and fpu exception and see if there is anything in the
> kernel ring buffer too.
>
> Regards,
> Zwane Mwaikambo
This will allow you to generate some math-errors and see if everything
works okay. By default, upon process creation, math errors like
/0 are masked.
/*
* Note FPU control only exists per process. Therefore, you have
* to set up the FPU before you use it in any program.
*/
#include <i386/fpu_control.h>
#define FPU_MASK (_FPU_MASK_IM |\
_FPU_MASK_DM |\
_FPU_MASK_ZM |\
_FPU_MASK_OM |\
_FPU_MASK_UM |\
_FPU_MASK_PM)
void fpu()
{
__setfpucw(_FPU_DEFAULT & ~FPU_MASK);
}
main() {
double zero=0.0;
double one=1.0;
fpu();
one /=zero;
}
Cheers,
Dick Johnson
Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).
I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.
* Re: floating point exception
2002-01-15 14:34 ` Zwane Mwaikambo
2002-01-15 14:46 ` Richard B. Johnson
@ 2002-01-15 18:19 ` Christian Thalinger
2002-01-15 18:31 ` Richard B. Johnson
2002-01-16 5:45 ` Zwane Mwaikambo
1 sibling, 2 replies; 27+ messages in thread
From: Christian Thalinger @ 2002-01-15 18:19 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: linux-kernel, Richard B. Johnson
On Tue, 2002-01-15 at 15:34, Zwane Mwaikambo wrote:
> On 14 Jan 2002, Christian Thalinger wrote:
>
> > It seems the floating point exception is only raised with a new data
> > package. Is there a simple way to raise such an exception?
>
> New data package? And does the same behaviour re-occur after the fpu
> exception? ie programs start segfaulting etc. Can you try doing a "dmesg"
> after the segfaults and fpu exception and see if there is anything in the
> kernel ring buffer too.
>
> Regards,
> Zwane Mwaikambo
>
There are .sah files, in which the data is stored to analyse. So i
deleted these files and the client downloads a new package -> new data
package.
Yes, it did happen that the segfault reoccurred and there is nothing in
the dmesg. This was also my first thought, then checked
/var/log/messages with a tail and it got stuck. No ctrl-c.
Tried this:
#define _GNU_SOURCE 1
#include <fenv.h>
main() {
double zero=0.0;
double one=1.0;
feenableexcept(FE_ALL_EXCEPT);
one /=zero;
}
...but nothing happens.
* Re: floating point exception
2002-01-15 18:19 ` Christian Thalinger
@ 2002-01-15 18:31 ` Richard B. Johnson
2002-01-15 18:49 ` Christian Thalinger
2002-01-16 5:45 ` Zwane Mwaikambo
1 sibling, 1 reply; 27+ messages in thread
From: Richard B. Johnson @ 2002-01-15 18:31 UTC (permalink / raw)
To: Christian Thalinger; +Cc: Zwane Mwaikambo, linux-kernel
On 15 Jan 2002, Christian Thalinger wrote:
> On Tue, 2002-01-15 at 15:34, Zwane Mwaikambo wrote:
> > On 14 Jan 2002, Christian Thalinger wrote:
[SNIPPED...]
>
> Tried this:
>
> #define _GNU_SOURCE 1
> #include <fenv.h>
>
> main() {
> double zero=0.0;
> double one=1.0;
>
> feenableexcept(FE_ALL_EXCEPT);
>
> one /=zero;
> }
>
Well, that won't even link. The source I showed previously
compiles and links fine. It also shows an FPU exception when
one divides by zero:
Script started on Tue Jan 15 13:27:05 2002
# gcc -o zzz zzz.c -lm
/tmp/ccjhyGHj.o: In function `main':
/tmp/ccjhyGHj.o(.text+0x25): undefined reference to `feenableexcept'
collect2: ld returned 1 exit status
# gcc -o zzz fpu.c
# zzz
Floating point exception (core dumped)
# cat fpu.c
/*
* Note FPU control only exists per process. Therefore, you have
* to set up the FPU before you use it in any program.
*/
#include <i386/fpu_control.h>
#define FPU_MASK (_FPU_MASK_IM |\
_FPU_MASK_DM |\
_FPU_MASK_ZM |\
_FPU_MASK_OM |\
_FPU_MASK_UM |\
_FPU_MASK_PM)
void fpu()
{
__setfpucw(_FPU_DEFAULT & ~FPU_MASK);
}
main() {
double zero=0.0;
double one=1.0;
fpu();
one /=zero;
}
# cat zzz.c
#define _GNU_SOURCE 1
#include <fenv.h>
main() {
double zero=0.0;
double one=1.0;
feenableexcept(FE_ALL_EXCEPT);
one /=zero;
}
You have new mail in /var/spool/mail/root
# exit
exit
Script done on Tue Jan 15 13:28:32 2002
Cheers,
Dick Johnson
Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).
I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.
* Re: floating point exception
2002-01-15 18:31 ` Richard B. Johnson
@ 2002-01-15 18:49 ` Christian Thalinger
0 siblings, 0 replies; 27+ messages in thread
From: Christian Thalinger @ 2002-01-15 18:49 UTC (permalink / raw)
To: Richard B. Johnson; +Cc: Zwane Mwaikambo, linux-kernel
On Tue, 2002-01-15 at 19:31, Richard B. Johnson wrote:
> On 15 Jan 2002, Christian Thalinger wrote:
>
> > On Tue, 2002-01-15 at 15:34, Zwane Mwaikambo wrote:
> > > On 14 Jan 2002, Christian Thalinger wrote:
> [SNIPPED...]
>
> >
> > Tried this:
> >
> > #define _GNU_SOURCE 1
> > #include <fenv.h>
> >
> > main() {
> > double zero=0.0;
> > double one=1.0;
> >
> > feenableexcept(FE_ALL_EXCEPT);
> >
> > one /=zero;
> > }
> >
> Well, that won't even link. The source I showed previously
> compiles and links fine. It also shows an FPU exception when
> one divides by zero:
>
> Script started on Tue Jan 15 13:27:05 2002
> # gcc -o zzz zzz.c -lm
> /tmp/ccjhyGHj.o: In function `main':
> /tmp/ccjhyGHj.o(.text+0x25): undefined reference to `feenableexcept'
> collect2: ld returned 1 exit status
This depends on the libc version. Seems you have 2.1. For me it's 2.2.
[root@sector17:/root/src]# cat fpu-exception.c
#define _GNU_SOURCE 1
#include <fenv.h>
main() {
double zero=0.0;
double one=1.0;
feenableexcept(FE_ALL_EXCEPT);
one /=zero;
}
[root@sector17:/root/src]# gcc -Wall -lm -o fpu-exception fpu-exception.c
fpu-exception.c:4: warning: return type defaults to `int'
fpu-exception.c: In function `main':
fpu-exception.c:11: warning: control reaches end of non-void function
[root@sector17:/root/src]# ./fpu-exception
Floating point exception
[root@sector17:/root/src]#
* Re: floating point exception
2002-01-13 12:43 floating point exception Christian Thalinger
@ 2002-01-15 23:28 ` Brian Gerst
2002-01-16 11:45 ` Christian Thalinger
0 siblings, 1 reply; 27+ messages in thread
From: Brian Gerst @ 2002-01-15 23:28 UTC (permalink / raw)
To: Christian Thalinger; +Cc: linux-kernel
Christian Thalinger wrote:
>
> Hi!
>
> Just downloaded again, after a long time, the setiathome client. I
> wanted to look how smooth my tyan dual works. So i started the client
> and after a few seconds it gets a `floating point exception'. No
> problem till now, cause it seems to be a seti bug. Ok.
>
> Right after that my window manager segfaults. Ok, switch to console,
> restart it and go. No! Can't start any programs anymore, no login. All
> tasks die one after the other, up to the complete lock of the machine.
> Even alt-sysrq doesn't work.
>
> So, this is kernel 2.4.17 and i'll try other kernels right after this
> email.
>
> Does anyone know what's going on?
What CPU do you have? Do you have the FPU emulator compiled in? Are
there any oops messages?
--
Brian Gerst
* Re: floating point exception
2002-01-15 18:19 ` Christian Thalinger
2002-01-15 18:31 ` Richard B. Johnson
@ 2002-01-16 5:45 ` Zwane Mwaikambo
2002-01-16 11:55 ` Christian Thalinger
1 sibling, 1 reply; 27+ messages in thread
From: Zwane Mwaikambo @ 2002-01-16 5:45 UTC (permalink / raw)
To: Christian Thalinger; +Cc: linux-kernel, Richard B. Johnson
On 15 Jan 2002, Christian Thalinger wrote:
> Yes, it did happen that the segfault reoccurred and there is nothing in
> the dmesg. This was also my first thought, then checked
> /var/log/messages with a tail and it got stuck. No ctrl-c.
ctrl-alt-sysrq k? I'd just like to know whether your box hung completely.
Could you also run the ver_linux script in linux/scripts so that we can
get a better idea of your operating environment.
Cheers,
Zwane Mwaikambo
* Re: floating point exception
2002-01-15 23:28 ` Brian Gerst
@ 2002-01-16 11:45 ` Christian Thalinger
2002-01-16 11:58 ` Dave Jones
2002-01-16 13:52 ` Brian Gerst
0 siblings, 2 replies; 27+ messages in thread
From: Christian Thalinger @ 2002-01-16 11:45 UTC (permalink / raw)
To: Brian Gerst; +Cc: linux-kernel
On Wed, 2002-01-16 at 00:28, Brian Gerst wrote:
> What CPU do you have? Do you have the FPU emulator compiled in? Are
> there any oops messages?
>
> --
> Brian Gerst
>
I mentioned in my first mail the dual tyan, so athlon xp, no fpu
emulator ;-) and no oops messages.
* Re: floating point exception
2002-01-16 5:45 ` Zwane Mwaikambo
@ 2002-01-16 11:55 ` Christian Thalinger
2002-01-16 14:32 ` Zwane Mwaikambo
0 siblings, 1 reply; 27+ messages in thread
From: Christian Thalinger @ 2002-01-16 11:55 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: linux-kernel, Richard B. Johnson
On Wed, 2002-01-16 at 06:45, Zwane Mwaikambo wrote:
> On 15 Jan 2002, Christian Thalinger wrote:
>
> > Yes, it did happen that the segfault reoccurred and there is nothing in
> > the dmesg. This was also my first thought, then checked
> > /var/log/messages with a tail and it got stuck. No ctrl-c.
>
> ctrl-alt-sysrq k? I'd just like to know whether your box hung completely.
> Could you also run the ver_linux script in linux/scripts so that we can
> get a better idea of your operating environment.
>
> Cheers,
> Zwane Mwaikambo
>
>
What i got at my last exception (started the client in tty1):
Listened to an mp3 with mpg123. After the exception the mp3 got in the
_he_my_system_is_completely_locked loop. Couldn't kill the process.
System was responsive, console switching was ok. Changed console to
tty2 where X was running - ctrl-c - X went down -> console switching
wasn't possible anymore.
ctrl-alt-sysrq was responding but only with the line:
SysRq : Emergency Sync
SysRq : .... (tried also the other ones)
but nothing happened. No syncing, no unmount and no showtasks. Right now i
noticed that showTasks, mem and pc do not give _any_ output, but syncing
works.
I'll do further testing when i'm back from work.
Gnu C 3.0.3
Gnu make 3.79.1
util-linux 2.11m
mount 2.11h
modutils 2.4.11
e2fsprogs 1.25
reiserfsprogs 3.x.0b
Linux C Library 2.2.4
Dynamic linker (ldd) 2.2.4
Linux C++ Library 3.0.2
Procps 2.0.7
Net-tools 1.60
Console-tools 0.2.3
Sh-utils 2.0.11
Modules Loaded NVdriver sym53c8xx scsi_mod pwcx-i386 pwc rio500
usb-ohci usbcore w83781d eeprom i2c-proc i2c-amd756 i2c-isa
binfmt_misc binfmt_aout ospm_processor ospm_system ospm_busmgr
sercontrol lirc_i2c lirc_dev tuner tvaudio msp3400 bttv videodev
i2c-algo-bit i2c-core nfsd lockd sunrpc parport_pc lp parport
via-rhine emu10k1 sound ac97_codec soundcore rtc
* Re: floating point exception
2002-01-16 11:45 ` Christian Thalinger
@ 2002-01-16 11:58 ` Dave Jones
2002-01-16 13:14 ` Bruce Harada
2002-01-16 13:52 ` Brian Gerst
1 sibling, 1 reply; 27+ messages in thread
From: Dave Jones @ 2002-01-16 11:58 UTC (permalink / raw)
To: Christian Thalinger; +Cc: Brian Gerst, linux-kernel
On 16 Jan 2002, Christian Thalinger wrote:
> I mentioned in my first mail the dual tyan, so athlon xp, no fpu
> emulator ;-) and no oops messages.
Dual Athlon XP problem. Thanks for playing.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
* Re: floating point exception
2002-01-16 11:58 ` Dave Jones
@ 2002-01-16 13:14 ` Bruce Harada
2002-01-16 20:06 ` Christian Thalinger
2002-01-17 19:26 ` bill davidsen
0 siblings, 2 replies; 27+ messages in thread
From: Bruce Harada @ 2002-01-16 13:14 UTC (permalink / raw)
To: Dave Jones; +Cc: e9625286, linux-kernel
On Wed, 16 Jan 2002 12:58:35 +0100 (CET)
Dave Jones <davej@suse.de> wrote:
> On 16 Jan 2002, Christian Thalinger wrote:
>
> > I mentioned in my first mail the dual tyan, so athlon xp, no fpu
> > emulator ;-) and no oops messages.
>
> Dual Athlon XP problem. Thanks for playing.
Interesting. That's the first actual report I've seen of problems caused by
using XPs instead of MPs. I'd been wondering if I could get away with XPs for
my next SMP box; now I know better ;)
* Re: floating point exception
2002-01-16 11:45 ` Christian Thalinger
2002-01-16 11:58 ` Dave Jones
@ 2002-01-16 13:52 ` Brian Gerst
2002-01-16 14:28 ` M. Edward (Ed) Borasky
1 sibling, 1 reply; 27+ messages in thread
From: Brian Gerst @ 2002-01-16 13:52 UTC (permalink / raw)
To: Christian Thalinger; +Cc: linux-kernel
Christian Thalinger wrote:
>
> On Wed, 2002-01-16 at 00:28, Brian Gerst wrote:
> > What CPU do you have? Do you have the FPU emulator compiled in? Are
> > there any oops messages?
> >
> > --
> > Brian Gerst
> >
>
> I mentioned in my first mail the dual tyan, so athlon xp, no fpu
> emulator ;-) and no oops messages.
Last I checked, Athlon XP's weren't certified for SMP, only MP's.
That's likely what the problem is. And for the record, Tyan also makes
Intel boards too.
Processor manufacturing 101: All processors of a given family come off
the same production line. Due to variations in the process, some
processors have defects that only show up at higher clock speeds, SMP
mode, etc. At the end of the line the processor is tested. If it fails
at higher clock speeds it is marked at a lower speed. If it fails SMP
it is marked as an XP. Market demand can also cause a chip to be rated
lower than it really is, so you can sometimes get away with
overclocking, etc. but it's just random luck if it really works.
--
Brian Gerst
* Re: floating point exception
2002-01-16 13:52 ` Brian Gerst
@ 2002-01-16 14:28 ` M. Edward (Ed) Borasky
0 siblings, 0 replies; 27+ messages in thread
From: M. Edward (Ed) Borasky @ 2002-01-16 14:28 UTC (permalink / raw)
To: linux-kernel
On Wed, 16 Jan 2002, Brian Gerst wrote:
> Last I checked, Athlon XP's weren't certified for SMP, only MP's.
> That's likely what the problem is. And for the record, Tyan also
> makes Intel boards too.
>
> Processor manufacturing 101: All processors of a given family come
> off the same production line. Due to variations in the process, some
> processors have defects that only show up at higher clock speeds, SMP
> mode, etc. At the end of the line the processor is tested. If it
> fails at higher clock speeds it is marked at a lower speed. If it
> fails SMP it is marked as an XP. Market demand can also cause a chip
> to be rated lower than it really is, so you can sometimes get away
> with overclocking, etc. but it's just random luck if it really works.
Could you be more specific on this "random luck" bit? Let's say we have
a production line making processors that should *all* run SMP at, say,
1800 MHz. What fraction of them will actually run SMP at 1800? What
fraction of them will actually run at 1800 UP? What fraction of them
will run at 1700 SMP, 1600 SMP, etc.? And what fraction of them run at
1800 SMP at the end of the line but croak when they get stuck in
<ducking> Aunt Tillie's motherboard?
I'm not looking for anyone's proprietary yield statistics here -- just a
rough idea of what kind of distributions we're dealing with here. For my
application, the 1.3 GHz Athlon I've got now is overkill. I wanted a
dual when I got the system last March, but there weren't any
motherboards available that I could find.
--
M. Edward Borasky
znmeb@borasky-research.net
The COUGAR Project
http://www.borasky-research.com/Cougar.htm
Q. How do you tell when a pineapple is ready to eat?
A. It picks up its knife and fork.
* Re: floating point exception
2002-01-16 11:55 ` Christian Thalinger
@ 2002-01-16 14:32 ` Zwane Mwaikambo
2002-01-16 20:26 ` Christian Thalinger
0 siblings, 1 reply; 27+ messages in thread
From: Zwane Mwaikambo @ 2002-01-16 14:32 UTC (permalink / raw)
To: Christian Thalinger; +Cc: linux-kernel, Richard B. Johnson
On 16 Jan 2002, Christian Thalinger wrote:
> Gnu C 3.0.3
> Gnu make 3.79.1
> util-linux 2.11m
> mount 2.11h
> modutils 2.4.11
> e2fsprogs 1.25
> reiserfsprogs 3.x.0b
> Linux C Library 2.2.4
> Dynamic linker (ldd) 2.2.4
> Linux C++ Library 3.0.2
> Procps 2.0.7
> Net-tools 1.60
> Console-tools 0.2.3
> Sh-utils 2.0.11
> Modules Loaded NVdriver sym53c8xx scsi_mod pwcx-i386 pwc rio500
> usb-ohci usbcore w83781d eeprom i2c-proc i2c-amd756 i2c-isa
> binfmt_misc binfmt_aout ospm_processor ospm_system ospm_busmgr
> sercontrol lirc_i2c lirc_dev tuner tvaudio msp3400 bttv videodev
> i2c-algo-bit i2c-core nfsd lockd sunrpc parport_pc lp parport
> via-rhine emu10k1 sound ac97_codec soundcore rtc
Can you also reproduce _without_ loading NVdriver, just to make everybody
happy.
Thanks,
Zwane Mwaikambo
* Re: floating point exception
2002-01-16 13:14 ` Bruce Harada
@ 2002-01-16 20:06 ` Christian Thalinger
2002-01-17 19:26 ` bill davidsen
1 sibling, 0 replies; 27+ messages in thread
From: Christian Thalinger @ 2002-01-16 20:06 UTC (permalink / raw)
To: Bruce Harada; +Cc: Dave Jones, linux-kernel
On Wed, 2002-01-16 at 14:14, Bruce Harada wrote:
> On Wed, 16 Jan 2002 12:58:35 +0100 (CET)
> Dave Jones <davej@suse.de> wrote:
>
> > On 16 Jan 2002, Christian Thalinger wrote:
> >
> > > I mentioned in my first mail the dual tyan, so athlon xp, no fpu
> > > emulator ;-) and no oops messages.
> >
> > Dual Athlon XP problem. Thanks for playing.
>
> Interesting. That's the first actual report I've seen of problems caused by
> using XPs instead of MPs. I'd been wondering if I could get away with XPs for
> my next SMP box; now I know better ;)
Don't be too scared. Everything works except this seti thingy.
* Re: floating point exception
2002-01-16 14:32 ` Zwane Mwaikambo
@ 2002-01-16 20:26 ` Christian Thalinger
2002-01-16 21:23 ` Richard B. Johnson
0 siblings, 1 reply; 27+ messages in thread
From: Christian Thalinger @ 2002-01-16 20:26 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: linux-kernel, Richard B. Johnson, davej
On Wed, 2002-01-16 at 15:32, Zwane Mwaikambo wrote:
> Can you also reproduce _without_ loading NVdriver, just to make everybody
> happy.
>
> Thanks,
> Zwane Mwaikambo
>
Sure, same breakdown. Maybe it's really a dual athlon xp issue as dave
jones mentioned. But shouldn't this also occur when i trigger a floating
point exception myself? Is there a way to check which floating point
exception was raised by the seti client?
Regards.
* Re: floating point exception
2002-01-16 20:26 ` Christian Thalinger
@ 2002-01-16 21:23 ` Richard B. Johnson
2002-01-16 21:59 ` Brian Gerst
0 siblings, 1 reply; 27+ messages in thread
From: Richard B. Johnson @ 2002-01-16 21:23 UTC (permalink / raw)
To: Christian Thalinger; +Cc: Zwane Mwaikambo, linux-kernel, davej
On 16 Jan 2002, Christian Thalinger wrote:
> On Wed, 2002-01-16 at 15:32, Zwane Mwaikambo wrote:
> > Can you also reproduce _without_ loading NVdriver, just to make everybody
> > happy.
> >
> > Thanks,
> > Zwane Mwaikambo
> >
>
> Sure, same breakdown. Maybe it's really a dual athlon xp issue as dave
> jones mentioned. But shouldn't this also occur when i trigger a floating
> point exception myself? Is there a way to check which floating point
> exception was raised by the seti client?
>
> Regards.
>
Maybe you can run it under gdb? Or `strace` it to a file? Usually
these things are caused by invalid 'C' runtime libraries, either
corrupt, "installed by just making a symlink to something that
was presumed to be close to what the application was compiled with",
or an error in mem-mapping.
Another very-real possibility is that somebody used floating-point
within the kernel thus corrupting the `seti` FPU state. You can
check this out by making a program that does lots of FP calculations,
perhaps the sine of a large number of values. You put the results
into one array. Then you do the exact same thing with the results
put into another array. Then just `memcmp` the arrays! You run
this in a loop for an hour. If the kernel is mucking with your FPU,
it will certainly show.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).
I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.
* Re: floating point exception
2002-01-16 21:23 ` Richard B. Johnson
@ 2002-01-16 21:59 ` Brian Gerst
2002-01-16 22:05 ` Richard B. Johnson
0 siblings, 1 reply; 27+ messages in thread
From: Brian Gerst @ 2002-01-16 21:59 UTC (permalink / raw)
To: root; +Cc: Christian Thalinger, Zwane Mwaikambo, linux-kernel, davej
"Richard B. Johnson" wrote:
>
> On 16 Jan 2002, Christian Thalinger wrote:
>
> > On Wed, 2002-01-16 at 15:32, Zwane Mwaikambo wrote:
> > > Can you also reproduce _without_ loading NVdriver, just to make everybody
> > > happy.
> > >
> > > Thanks,
> > > Zwane Mwaikambo
> > >
> >
> > Sure, same breakdown. Maybe it's really a dual athlon xp issue as dave
> > jones mentioned. But shouldn't this also occur when i trigger a floating
> > point exception myself? Is there a way to check which floating point
> > exception was raised by the seti client?
> >
> > Regards.
> >
>
> Maybe you can run it under gdb? Or `strace` it to a file? Usually
> these things are caused by invalid 'C' runtime libraries, either
> corrupt, "installed by just making a symlink to something that
> was presumed to be close to what the application was compiled with",
> or an error in mem-mapping.
>
> Another very-real possibility is that somebody used floating-point
> within the kernel thus corrupting the `seti` FPU state. You can
> check this out by making a program that does lots of FP calculations,
> perhaps the sine of a large number of values. You put the results
> into one array. Then you do the exact same thing with the results
> put into another array. Then just `memcmp` the arrays! You run
> this in a loop for an hour. If the kernel is mucking with your FPU,
> it will certainly show.
Hmm, that's an interesting idea... An Athlon optimised kernel does use
the MMX/FPU registers to do mem copies. Try running a kernel compiled
for just a Pentium and see if the problem persists.
--
Brian Gerst
* Re: floating point exception
2002-01-16 21:59 ` Brian Gerst
@ 2002-01-16 22:05 ` Richard B. Johnson
2002-01-16 22:12 ` Mark Zealey
2002-01-16 23:35 ` Christian Thalinger
0 siblings, 2 replies; 27+ messages in thread
From: Richard B. Johnson @ 2002-01-16 22:05 UTC (permalink / raw)
To: Brian Gerst; +Cc: Christian Thalinger, Zwane Mwaikambo, linux-kernel, davej
On Wed, 16 Jan 2002, Brian Gerst wrote:
> "Richard B. Johnson" wrote:
> >
> > On 16 Jan 2002, Christian Thalinger wrote:
> >
> > > On Wed, 2002-01-16 at 15:32, Zwane Mwaikambo wrote:
> > > > Can you also reproduce _without_ loading NVdriver, just to make everybody
> > > > happy.
> > > >
> > > > Thanks,
> > > > Zwane Mwaikambo
> > > >
> > >
> > > Sure, same breakdown. Maybe it's really a dual athlon xp issue as dave
> > > jones mentioned. But shouldn't this also occur when i trigger a floating
> > > point exception myself? Is there a way to check which floating point
> > > exception was raised by the seti client?
> > >
> > > Regards.
> > >
[SNIPPED...]
> > into one array. Then you do the exact same thing with the results
> > put into another array. Then just `memcmp` the arrays! You run
> > this in a loop for an hour. If the kernel is mucking with your FPU,
> > it will certainly show.
>
> Hmm, that's an interesting idea... An Athlon optimised kernel does use
> the MMX/FPU registers to do mem copies. Try running a kernel compiled
> for just a Pentium and see if the problem persists.
>
Here's a progy.. This SHOULD run forever. I assume malloc() works and
don't check the result --yes I already know that.
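#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <math.h>

#define MAX_FLOAT 0x100000

int main(void)
{
    unsigned int seed;
    double *x;
    double *y;
    double *z;
    size_t i;

    /* malloc() results are deliberately left unchecked, as noted above. */
    x = (double *) malloc(MAX_FLOAT * sizeof(double));
    y = (double *) malloc(MAX_FLOAT * sizeof(double));
    /* Assign the result: storing a time_t through a pointer to the
     * smaller unsigned int would overrun seed where time_t is wider. */
    seed = (unsigned int) time(NULL);
    for(;;)
    {
        /* First pass: fill x from the sequence selected by seed. */
        srand(seed);
        z = x;
        for(i = 0; i < MAX_FLOAT; i++)
            *z++ = cos((double) rand());
        /* Second pass: same seed, so y must come out bit-identical. */
        srand(seed);
        z = y;
        for(i = 0; i < MAX_FLOAT; i++)
            *z++ = cos((double) rand());
        if(memcmp(x, y, MAX_FLOAT * sizeof(double)))
            break;
        /* Pick a new seed for the next round. */
        seed = rand();
    }
    fprintf(stderr, "Floating point failure\n");
    return 1;
}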
Cheers,
Dick Johnson
Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).
I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.
* Re: floating point exception
2002-01-16 22:05 ` Richard B. Johnson
@ 2002-01-16 22:12 ` Mark Zealey
2002-01-16 22:23 ` Richard B. Johnson
2002-01-16 23:35 ` Christian Thalinger
1 sibling, 1 reply; 27+ messages in thread
From: Mark Zealey @ 2002-01-16 22:12 UTC (permalink / raw)
To: linux-kernel
On Wed, Jan 16, 2002 at 05:05:55PM -0500, Richard B. Johnson wrote:
> for(;;)
> {
> srand(seed);
> z = x;
> for(i = 0; i < MAX_FLOAT; i++)
> *z++ = cos((double) rand());
> srand(seed);
> z = y;
> for(i = 0; i < MAX_FLOAT; i++)
> *z++ = cos((double) rand());
> if(memcmp(x, y, MAX_FLOAT * sizeof(double)))
> break;
> seed = rand();
Um, maybe I'm not reading this properly.. why are you randing, doing 1 set and
then using different random values for the other set ?
--
Mark Zealey
mark@zealos.org
mark@itsolve.co.uk
UL++++>$ G!>(GCM/GCS/GS/GM) dpu? s:-@ a16! C++++>$ P++++>+++++$ L+++>+++++$
!E---? W+++>$ N- !o? !w--- O? !M? !V? !PS !PE--@ PGP+? r++ !t---?@ !X---?
!R- b+ !tv b+ DI+ D+? G+++ e>+++++ !h++* r!-- y--
(www.geekcode.com)
* Re: floating point exception
2002-01-16 22:12 ` Mark Zealey
@ 2002-01-16 22:23 ` Richard B. Johnson
0 siblings, 0 replies; 27+ messages in thread
From: Richard B. Johnson @ 2002-01-16 22:23 UTC (permalink / raw)
To: Mark Zealey; +Cc: linux-kernel
On Wed, 16 Jan 2002, Mark Zealey wrote:
> On Wed, Jan 16, 2002 at 05:05:55PM -0500, Richard B. Johnson wrote:
>
> > for(;;)
> > {
> > srand(seed);
^^^^^^^^^^^^^^^^
> > z = x;
> > for(i = 0; i < MAX_FLOAT; i++)
> > *z++ = cos((double) rand());
> > srand(seed);
^^^^^^^^^^^^^
> > z = y;
> > for(i = 0; i < MAX_FLOAT; i++)
> > *z++ = cos((double) rand());
> > if(memcmp(x, y, MAX_FLOAT * sizeof(double)))
> > break;
> > seed = rand();
>
> Um, maybe I'm not reading this properly.. why are you randing, doing 1 set and
> then using different random values for the other set ?
I am NOT. I am setting the seed BACK to whatever it was for the first
set with srand(seed). After the compare, I change the seed.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).
I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.
* Re: floating point exception
2002-01-16 22:05 ` Richard B. Johnson
2002-01-16 22:12 ` Mark Zealey
@ 2002-01-16 23:35 ` Christian Thalinger
1 sibling, 0 replies; 27+ messages in thread
From: Christian Thalinger @ 2002-01-16 23:35 UTC (permalink / raw)
To: Richard B. Johnson; +Cc: Brian Gerst, Zwane Mwaikambo, linux-kernel, davej
On Wed, 2002-01-16 at 23:05, Richard B. Johnson wrote:
> Here's a progy.. This SHOULD run forever. I assume malloc() works and
> don't check the result --yes I already know that.
>
[snip]
It ran for about 70 minutes on both CPUs (I started it twice) and no problem
occurred. I still have to try the Pentium-optimized kernel.
Regards.
* Re: floating point exception
2002-01-16 13:14 ` Bruce Harada
2002-01-16 20:06 ` Christian Thalinger
@ 2002-01-17 19:26 ` bill davidsen
1 sibling, 0 replies; 27+ messages in thread
From: bill davidsen @ 2002-01-17 19:26 UTC (permalink / raw)
To: linux-kernel
In article <1011211577.1617.4.camel@sector17.home.at>,
Christian Thalinger <e9625286@student.tuwien.ac.at> wrote:
| On Wed, 2002-01-16 at 14:14, Bruce Harada wrote:
| > On Wed, 16 Jan 2002 12:58:35 +0100 (CET)
| > Dave Jones <davej@suse.de> wrote:
| >
| > > On 16 Jan 2002, Christian Thalinger wrote:
| > >
| > > > I mentioned in my first mail the dual tyan, so athlon xp, no fpu
| > > > emulator ;-) and no oops messages.
| > >
| > > Dual Athlon XP problem. Thanks for playing.
| >
| > Interesting. That's the first actual report I've seen of problems caused by
| > using XPs instead of MPs. I'd been wondering if I could get away with XPs for
| > my next SMP box; now I know better ;)
|
| Don't be too scared. Everything works except this seti thingy.
Does this run correctly on UP? And is this the right version of
setiathome for this CPU, not one using SSE7, 4Dthen, or some other
proprietary FP method? And until it is proven to work with the MP part,
should that ever actually be shipped instead of just advertised, I wouldn't
be totally sure whether the fault lies with the XP, the kernel, or even
RAM problems under load, etc.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Floating Point Exception
@ 2002-12-27 14:18 Nandakumar NarayanaSwamy
0 siblings, 0 replies; 27+ messages in thread
From: Nandakumar NarayanaSwamy @ 2002-12-27 14:18 UTC (permalink / raw)
To: linux-kernel
Hi All,
I am getting a floating point exception when I run my code.
I have an IDT 32334 MIPS processor on my board and
I am using a cross compiler to build the image.
When the program is executed, it exits after some time
with the message "Floating point exception". No other dump is
displayed. There seem to be some illegal floating point operations
in my code.
Can anyone suggest why I am getting this and
how to solve it?
Thanks in advance,
Nanda
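One way to narrow this down: "Floating point exception" is what the shell prints when a process is killed by SIGFPE, and despite the name that signal is also delivered for integer division by zero on many platforms. A minimal sketch, assuming a POSIX system with sigaction() and SA_SIGINFO, is to install a handler that reports the si_code before exiting. The names fpe_code_name, on_sigfpe and install_fpe_reporter are hypothetical, and whether the board's kernel fills in si_code precisely has not been checked here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Map a SIGFPE si_code to a short description. */
const char *fpe_code_name(int code)
{
    switch (code) {
    case FPE_INTDIV: return "integer divide by zero";
    case FPE_INTOVF: return "integer overflow";
    case FPE_FLTDIV: return "floating-point divide by zero";
    case FPE_FLTOVF: return "floating-point overflow";
    case FPE_FLTUND: return "floating-point underflow";
    case FPE_FLTRES: return "floating-point inexact result";
    case FPE_FLTINV: return "invalid floating-point operation";
    case FPE_FLTSUB: return "subscript out of range";
    default:         return "unknown cause";
    }
}

/* write() is used because printf() is not safe inside a signal handler. */
static void say(const char *s)
{
    ssize_t r = write(STDERR_FILENO, s, strlen(s));
    (void) r;
}

/* Report why SIGFPE was raised, then exit; returning would re-run
 * the faulting instruction. */
static void on_sigfpe(int sig, siginfo_t *info, void *ctx)
{
    (void) sig;
    (void) ctx;
    say("SIGFPE: ");
    say(fpe_code_name(info->si_code));
    say("\n");
    _exit(1);
}

/* Install the reporter; returns 0 on success, -1 on failure. */
int install_fpe_reporter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

Calling install_fpe_reporter() at the top of main() and rerunning the program should then show which kind of fault occurred. If it reports an integer divide by zero, the floating point code is probably not the culprit.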
Thread overview: 27+ messages
2002-01-13 12:43 floating point exception Christian Thalinger
2002-01-15 23:28 ` Brian Gerst
2002-01-16 11:45 ` Christian Thalinger
2002-01-16 11:58 ` Dave Jones
2002-01-16 13:14 ` Bruce Harada
2002-01-16 20:06 ` Christian Thalinger
2002-01-17 19:26 ` bill davidsen
2002-01-16 13:52 ` Brian Gerst
2002-01-16 14:28 ` M. Edward (Ed) Borasky
-- strict thread matches above, loose matches on Subject: below --
2002-01-14 10:56 Zwane Mwaikambo
2002-01-14 21:26 ` Christian Thalinger
2002-01-15 14:34 ` Zwane Mwaikambo
2002-01-15 14:46 ` Richard B. Johnson
2002-01-15 18:19 ` Christian Thalinger
2002-01-15 18:31 ` Richard B. Johnson
2002-01-15 18:49 ` Christian Thalinger
2002-01-16 5:45 ` Zwane Mwaikambo
2002-01-16 11:55 ` Christian Thalinger
2002-01-16 14:32 ` Zwane Mwaikambo
2002-01-16 20:26 ` Christian Thalinger
2002-01-16 21:23 ` Richard B. Johnson
2002-01-16 21:59 ` Brian Gerst
2002-01-16 22:05 ` Richard B. Johnson
2002-01-16 22:12 ` Mark Zealey
2002-01-16 22:23 ` Richard B. Johnson
2002-01-16 23:35 ` Christian Thalinger
2002-12-27 14:18 Floating Point Exception Nandakumar NarayanaSwamy