From: Michal Jaegermann <michal@harddata.com>
To: linux-kernel@vger.kernel.org
Subject: Re: BUG: Global FPU corruption in 2.2
Date: Thu, 19 Apr 2001 14:18:44 -0600
Message-ID: <20010419141844.A26200@mail.harddata.com>
In-Reply-To: <cpx7l0g3mfk.fsf@goat.cs.wisc.edu>; from zandy@cs.wisc.edu on Thu, Apr 19, 2001 at 11:05:03AM -0500

On Thu, Apr 19, 2001 at 11:05:03AM -0500, Victor Zandy wrote:
> 
> We have found that one of our programs can cause system-wide
> corruption of the x86 FPU under 2.2.16 and 2.2.17.
....
> 
> We see this problem on dual 550MHz Xeons with 1GB RAM.

Hm, I am starting to wonder whether this is related to a report I
received recently.  "The victim" was running (basically) 2.2.19 on an
SMP Alpha UP2000+ with two 800 MHz processors.  He managed to reduce the
problem to a rather small test case; I attach the sources, a Makefile and
a "loop.sh" driver as a shar archive in case you want a closer look.

This "loop.sh" simply fires off triplets of "harry" processes in a loop.
The guy hit by this gets apparently random floating point exceptions,
starting with roughly the sixth process; the intervals between later
bombs vary.  I also have 'strace' output from the failing processes, but
it does not tell very much.  'gdb' is not very illuminating either:

Program received signal SIGFPE, Arithmetic exception.
0x1200010a8 in vadd_ (a=0x11fff21e4, ia=0x120003294, b=0x11fff7004, 
    ib=0x120003294, c=0x11fffbe20, ic=0x120003294, n=0x11ffffc70) at vadd.f:99
99               C(CI) = A(AI) + B(BI)
Current language:  auto; currently fortran

(gdb) p *ia
$10 = 1
(gdb) p *ib
$11 = 1
(gdb) p *ic
$12 = 1
(gdb) p *n
Cannot access memory at address 0x4
(gdb) p *(0x11ffffc70)
$13 = 1024

(gdb) info locals
n = (PTR TO -> ( integer )) 0x4
__g77_expr_0 = 10


He tells me that he sees this on two different machines that he has
around.
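
For anyone reading without the shar archive at hand, the driver described
above can be approximated by the sketch below.  It is not the attached
"loop.sh", only a Python stand-in written under assumptions: the function
names run_triplet and count_sigfpe are made up for illustration, and the
command to run (e.g. ./harry) is supplied by the caller.  It starts three
copies of a command at once, waits for them, and counts how many were
killed by SIGFPE.

```python
import signal
import subprocess


def run_triplet(cmd):
    """Start three copies of cmd concurrently and return their exit statuses."""
    procs = [subprocess.Popen(cmd) for _ in range(3)]
    return [p.wait() for p in procs]


def count_sigfpe(cmd, rounds):
    """Run `rounds` triplets of cmd and count processes killed by SIGFPE."""
    bombs = 0
    for _ in range(rounds):
        for status in run_triplet(cmd):
            # On POSIX, Popen reports death by signal N as return code -N.
            if status == -signal.SIGFPE:
                bombs += 1
    return bombs
```

Something like count_sigfpe(["./harry"], 100) would then give a rough
count of bombs per hundred triplets; the round count is arbitrary.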

The trouble is that I tried to reproduce this with different hardware,
kernels, compilers and libraries and failed, even on SMP; but the box
I had access to has only 667 MHz processors.  OTOH he is now running
2.4.3-ac9 plus Andrea Arcangeli's patches for rw semaphores on Alpha,
and he reports that the problem went away (and, hopefully, nothing
else will crop up :-).

Can anybody offer some insight into what this may really be?  It may,
of course, be totally unrelated to the report from Victor Zandy.

  Michal
  michal@harddata.com


[-- Attachment #2: fpbomb.shar --]
[-- Type: application/x-shar, Size: 12565 bytes --]

Thread overview: 38+ messages
2001-04-19 16:05 BUG: Global FPU corruption in 2.2 Victor Zandy
2001-04-19 20:18 ` Michal Jaegermann [this message]
2001-04-20 18:50 ` Victor Zandy
2001-04-20 19:07   ` Richard B. Johnson
2001-04-20 19:20     ` Victor Zandy
2001-04-20 19:44       ` Richard B. Johnson
2001-04-20 19:23     ` Ulrich Drepper
2001-04-20 19:37       ` Richard B. Johnson
2001-04-20 20:20         ` Victor Zandy
2001-04-20 21:44         ` Ulrich Drepper
2001-04-22  1:46           ` Richard B. Johnson
2001-04-22  2:18             ` Alan Cox
2001-04-22  2:30               ` Richard B. Johnson
2001-04-22 18:39           ` David Konerding
2001-04-22 18:59             ` Alan Cox
2001-04-22 20:59 ` kees
2001-04-23 16:11 ` Christian Ehrhardt
2001-04-23 18:44   ` Erik Paulson
2001-04-24 16:10   ` Linus Torvalds
2001-04-24 16:25     ` Alan Cox
2001-04-24 16:56     ` Christian Ehrhardt
2001-04-24 20:15       ` Michal Jaegermann
2001-04-24 19:49     ` BUG: USB/Reboot Collectively Unconscious
2001-04-24 21:41       ` Alan Cox
2001-04-25 12:37         ` Collectively Unconscious
2001-04-30 22:46           ` Alan Cox
2001-04-27 12:18         ` Collectively Unconscious
  -- strict thread matches above, loose matches on Subject: below --
2001-04-24  5:33 BUG: Global FPU corruption in 2.2 alad
2001-04-24  7:56 alad
2001-04-24  8:56 alad
2001-04-24 13:05 Victor Zandy
2001-04-24 16:24 ` Linus Torvalds
2001-04-24 16:47 ` Christian Ehrhardt
2001-04-24 18:09   ` Victor Zandy
2001-04-24 18:21 Victor Zandy
2001-04-24 18:37 ` Alan Cox
2001-04-24 19:17   ` Victor Zandy
2001-04-24 19:51     ` Alan Cox
