From: James Bottomley <James.Bottomley@SteelEye.com>
To: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
Alexis Bruemmer <alexisb@us.ibm.com>
Subject: Re: [PATCH 00/12] Roll-up of sas_ata patches
Date: Sun, 04 Feb 2007 09:11:46 -0600
Message-ID: <1170601907.3424.4.camel@mulgrave.il.steeleye.com>
In-Reply-To: <45C5A581.1070504@us.ibm.com>

On Sun, 2007-02-04 at 01:21 -0800, Darrick J. Wong wrote:
> James Bottomley wrote:
>
> > There's a problem somewhere with your error handler changes (which I
> > picked up thanks to the problems with the V28 firmware). What I see
> > without your changes is that for a directly attached SATA device, when
> > the firmware begins its death spiral, the commands all return and
> > eventually send I/O errors to the filesystem. With your patch series
> > applied, it just loops forever giving messages like:
> >
> > Feb 3 12:07:06 localhost kernel: aic94xx: escb_tasklet_complete: phy5: LINK_RESET_ERROR
> > Feb 3 12:07:06 localhost kernel: aic94xx: phy5: Receive FIS timeout
> > Feb 3 12:07:06 localhost kernel: aic94xx: phy5: retries:0 performing link reset seq
> > Feb 3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: aic94xx: control_phy_tasklet_complete: phy5, lrate:0x8, proto:0xe
> > Feb 3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
> > Feb 3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
>
> Interesting, since the opposite happens with SAS disks. :)

Well, the initial error is a firmware-induced drive error of some type.

> The infinite loop is usually what happens if a scsi_cmnd gets pulled off
> the eh queue without being scsi_eh_finish_cmnd()'d. Can you send me the
> whole dmesg? It's possible that we're trying to abort a command, which
> of course fails for a SATA disk, so we try bigger and bigger hammers....
> and the big hammers don't call scsi-eh-finish-cmd.

I've put the full log from detection of the aic94xx to forced power off
(all 512k of it) at
http://www2.kernel.org:/pub/linux/kernel/people/jejb/klog.aic94xx.failure.txt
(give it a while for the kernel.org mirrors to propagate)

> Did these SATA link reset errors only start showing up after the v28
> firmware patch, or has this always happened? I've noticed lately that I
> get link reset errors if I run a short exercise on an ext3 filesystem on
> a SATA disk, yet dd exercise runs just fine. But I had also thought
> that it was just my flaky hardware. :)

Er ... no idea ... The problem only shows up with V28 firmware, so I've
never seen a SATA disc fail with the V17 firmware.
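
For illustration only, the failure mode described in the quoted text above (a
command pulled off the eh queue without being finished, so recovery keeps
being re-entered) can be sketched as a toy user-space model. This is not
kernel code: recover_host, hammers_finish_cmd and max_passes are hypothetical
names, and the model assumes that the abort always fails for the SATA disk
and that the handler runs again whenever a failed command is still
unfinished.

```python
# Toy user-space model, not kernel code. Every name here is hypothetical.

def recover_host(failed_cmds, hammers_finish_cmd, max_passes=5):
    """Return how many handler passes it takes to drain the failed commands.

    Returns None if max_passes is reached, standing in for "loops forever".
    """
    unfinished = set(failed_cmds)  # commands still owed a "finish" call
    for passes in range(1, max_passes + 1):
        eh_queue = list(unfinished)  # each pass starts from what is unfinished
        while eh_queue:
            cmd = eh_queue.pop()  # command pulled off the eh queue
            abort_ok = False  # assumption: abort always fails for the SATA disk
            if abort_ok or hammers_finish_cmd:
                unfinished.discard(cmd)  # models finishing the command
            # otherwise the command left the queue without being finished
        if not unfinished:
            return passes
    return None


print(recover_host(["cmd0", "cmd1"], hammers_finish_cmd=True))   # prints 1
print(recover_host(["cmd0", "cmd1"], hammers_finish_cmd=False))  # prints None
```

In this model the only difference between the two runs is whether the
escalation path finishes the command; when it does not, every pass ends with
the same commands outstanding, which would match the repeated Enter/Exit
sas_scsi_recover_host lines if the real handler behaves the same way.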