From: Aaron Lehmann <aaronl@vitelus.com>
To: Jeff Garzik <jgarzik@pobox.com>
Cc: linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Promise SATA oops
Date: Mon, 20 Feb 2006 20:21:27 -0800
Message-ID: <20060221042127.GC11106@vitelus.com>
In-Reply-To: <20051220201719.GC15466@vitelus.com>
This crash kept happening for months, across all versions of the
kernel I tried (up through early 2.6.16 git snapshots). I ended up
buying a different SATA card, and this seems to have fixed the
problem. At around the same frequency as I experienced the nasty
hanging, I'm seeing this error message:

ata1: command 0xea timeout, stat 0x51 host_stat 0x0
ata1: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }

...but the system continues running fine. This leads me to believe
that there's a bug in the Promise SATA driver that prevents it from
gracefully handling this error condition, whatever it is. The hard
drives are model WDC WD3200JD-60K, and I couldn't find any bad blocks
on them.
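
For anyone reading the log lines above, here is a rough sketch of how the
status and error bytes map to the names the kernel printed. It is only an
illustration, not the driver's code: the bit names that appear in the log
(DriveReady, SeekComplete, Error, DriveStatusError) are taken from it, and
the remaining names are my assumption based on the conventional ATA
register layout. The `decode` helper is hypothetical.

```python
# Sketch: decode ATA status/error register bytes into flag names.
# Names seen in the log above are reused as-is; the others are assumed
# from the conventional ATA register layout and may differ from what a
# given kernel version prints.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

ERROR_BITS = {
    0x80: "BadCRC",
    0x40: "UncorrectableError",
    0x10: "SectorIdNotFound",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in names.items() if value & mask]


if __name__ == "__main__":
    print("status=0x51", decode(0x51, STATUS_BITS))
    print("error=0x04", decode(0x04, ERROR_BITS))
```

With the values from the log, 0x51 has bits 0x40, 0x10 and 0x01 set, and
0x04 is the single abort bit, which matches the two decoded lines above.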
On Tue, Dec 20, 2005 at 12:17:19PM -0800, Aaron Lehmann wrote:
> On Fri, Dec 02, 2005 at 11:51:09AM -0800, Aaron Lehmann wrote:
> > Still isn't stable. It froze within hours after announcing in all
> > terminals that it was disabling a certain IRQ. Now the RAID is so
> > degraded that root can't even be mounted. Was the Promise controller a
> > bad choice for a reliable setup?
> >
> > I may not have time to look at this further until late next week, but
> > I'll follow up with whatever I learn.
>
> Argh, died again!! It had been stable for over 12 days. Same error
> message, and the root md is degraded and dirty just like last time.
> This is a very severe state with high risk of data loss. When things
> went sour, terminals and most applications still kept working, but
> anything that touched the filesystem froze up. I had a shell open in a
> chroot on a ramdisk, but dmesg just hung for a few minutes and then
> exited with a "Bus error". I had no other way of examining the kernel
> log since the machine runs X.
>
> This was running 2.6.15-rc4. Crashes seem to happen less frequently
> with it than with 2.6.14.x, but when they happen they leave the RAID
> in a severe state. I also don't think 2.6.14.2 said anything about
> disabling the IRQ.
>
> I'm very desperate now. About every week I experience a crash that
> damages my RAID array to the point where it can't boot, as if the
> instability wasn't bad enough. Do I need to buy a hardware RAID card?