public inbox for linux-kernel@vger.kernel.org
From: Erik Bourget <erik@midmaine.com>
To: andre@linux-ide.org
Cc: linux-kernel@vger.kernel.org
Subject: SiI680 oops/panic
Date: Thu, 09 Oct 2003 17:18:37 -0400
Message-ID: <87oewqt8ia.fsf@loki.odinnet>


(Cc to lkml in case anybody else has any intuition).

Hello Andre;

A while ago I spoke with you about my funky SiI680 experiences.  I have since
decided that my Hitachi "DeathStar" 180GXP hard drives were all faulty, and
replaced them with Western Digital drives.  This was bad.  On the other hand,
the kernel /DID/ oops/panic a few times, and I finally managed to drive to the
datacenter before somebody rebooted it.

And I took a shot with my trusty digital camera.
http://tacos.sus.mcgill.ca/~erik/oops-panic.jpg (24,457b) 
(textual bits have been re-typed below)

Kernel: 2.4.21, SMP (also happened on 2.4.22, also SMP)
(Note that the box has only one CPU, and no 'hyperthreading')

Text from it:

printing eip:
3d3d3d3d
*pde = 00000000
Oops: 0000
CPU:   0
EIP:  0010:[<3d3d3d3d>]  Not tainted
EFLAGS: 00010046
...
Process nfsd (pid: 200)
...
Code: Bad EIP value.
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
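
(A note on the EIP value above: every byte of 3d3d3d3d is 0x3d, which is ASCII
'='.  That may hint that a code pointer or return address was overwritten with
text data, but that is only a guess, not a diagnosis.  The snippet below is a
minimal sketch of the decoding; decode_eip is a made-up helper name, not
anything from the kernel or its tools.)

```python
def decode_eip(eip_hex: str) -> str:
    """Show the bytes of a 32-bit EIP value as ASCII text.

    x86 is little-endian, so the bytes are reversed to reflect the order
    they would have had in memory. Non-printable bytes are shown as '.'.
    """
    raw = bytes.fromhex(eip_hex)[::-1]
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


if __name__ == "__main__":
    # EIP as printed in the oops above
    print(decode_eip("3d3d3d3d"))  # prints "===="
```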

Hardware:
  Dell "650" 1U server, P4 2.4GHz, 512MB, 2x120-GB Hitachi 180GXP DeskStar
  drives in RAID-1 configuration.

Software: vanilla Debian woody, vanilla kernel.org kernels.

Load on the drives was constant and extreme.  The machines serve as mail
storage for our ISP's mail system.  They were running a constant sync
process that checked a database to see which users had authenticated with the
mail system (via POP3, IMAP, etc. - such that they might have deleted mail) and
ran rsync on their directories to an identical machine over a 1000 Mbit link.
Literally -

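while (1) {
         # users who authenticated since the last pass
         my @userlist = changed_directories();
         foreach (@userlist) {
                 # mirror this user's directory to the identical machine
                 do_rsync("localhost:$_", "remotehost:$_");
         }
         sleep(30);
}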

Can you make heads or tails of this?  My first thought is that some driver
isn't handling faulty hardware in an error-tolerant way.

Thanks for your time;

- Erik Bourget


