From: Andrew Morton <akpm@linux-foundation.org>
To: Max Krasnyansky <maxk@qualcomm.com>
Cc: linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org
Subject: Re: Strange freezes (seems like SATA related)
Date: Thu, 1 Nov 2007 16:40:46 -0700
Message-ID: <20071101164046.462f40f0.akpm@linux-foundation.org>
In-Reply-To: <47261043.5020907@qualcomm.com>
On Mon, 29 Oct 2007 09:54:27 -0700
Max Krasnyansky <maxk@qualcomm.com> wrote:
> A couple of HP xw9300 machines (dual Opterons) started freezing up.
> We're running 2.6.22.1 on them. The freezes are somewhat weird: the VGA console is alive
> (I can switch vts, etc) but everything else is dead (network, etc).
> Unfortunately SYSRQ was not enabled and I could not get backtraces and stuff.
>
> Hooked up serial console and the only error that shows up is this.
>
> ata1: EH in ADMA mode, notifier 0x1 notifier_error 0x0 gen_ctl 0x1581000 status 0x1540 next cpb count 0x0 next cpb idx 0x0
> ata1: CPB 0: ctl_flags 0xd, resp_flags 0x1
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata1.00: cmd ca/00:08:57:00:80/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
> res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Descriptor sense data with sense descriptors (in hex):
> end_request: I/O error, dev sda, sector 8388695
> Buffer I/O error on device sda1, logical block 1048579
> lost page write due to I/O error on sda1
> sd 0:0:0:0: [sda] Write Protect is off
>
> I see a bunch of those and then the box just sits there spewing this periodically
>
> ata1: EH in ADMA mode, notifier 0x1 notifier_error 0x0 gen_ctl 0x1581000 status 0x1540 next cpb count 0x0 next cpb idx 0x0
> ata1: CPB 0: ctl_flags 0xd, resp_flags 0x1
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata1.00: cmd ca/00:08:4f:00:f8/00:00:00:00:00/e1 tag 0 cdb 0x0 data 4096 out
> res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>
> SMART selftest on the drive passed without errors.
>
> Here is what this machine looks like
>
> ...
So this happens on more than one machine?
The kernel shouldn't freeze, so even if both machines have magically
identical hardware faults, there's a kernel bug there somewhere.
I guess it would be useful to test a 2.6.23 kernel if poss. We've seen a
very large number of reports like this one in recent months (many of which
have not been responded to, btw) and perhaps someone has done something
about them.