From: Mike Anderson <andmike@us.ibm.com>
To: James Bottomley <James.Bottomley@SteelEye.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>,
linux-scsi@vger.kernel.org, alexisb@us.ibm.com
Subject: Re: timeout during sas discovery (aic94xx)
Date: Tue, 29 Aug 2006 09:03:07 -0700
Message-ID: <20060829160307.GA11028@us.ibm.com>
In-Reply-To: <1156859623.3458.3.camel@mulgrave.il.steeleye.com>
James Bottomley <James.Bottomley@SteelEye.com> wrote:
> On Mon, 2006-08-28 at 22:44 -0700, Darrick J. Wong wrote:
> > Uh... I don't think that phy_reset function ever gets called. My
> > ten-second grep of the libsas/aic94xx code doesn't yield any takers.
> > Maybe one of those functions that gets called after time index 575.791
> > should be doing that?
>
> I see the same thing occasionally in my sata on expanders setup.
>
> The problem is that the error handling in the SMP functions isn't
> robust. Try this patch; it works for me(tm), but it's obviously wrong
> since it simply blasts a reset.
>
"Aug 27 23:32:02 elm3a176 kernel: [ 575.791927] sas: command 0xffff8100674d9c80, task 0xffff8100727ede00, timed out: EH_NOT_HANDLED
Aug 27 23:32:02 elm3a176 kernel: [ 575.801352] sas: Enter sas_scsi_recover_host
Aug 27 23:32:02 elm3a176 kernel: [ 575.806022] sas: going over list...
Aug 27 23:32:02 elm3a176 kernel: [ 575.809894] sas: trying to find task 0xffff8100727ede00
Aug 27 23:32:02 elm3a176 kernel: [ 575.815513] sas: sas_scsi_find_task: aborting task 0xffff8100727ede00
Aug 27 23:32:07 elm3a176 kernel: [ 580.818573] aic94xx: tmf timed out
Aug 27 23:32:07 elm3a176 kernel: [ 580.822369] aic94xx: tmf came back
"
I think this failure mode is on a different path than the one your patch
tries to address. We are sending an inquiry to the device, and it comes
through the standard IO path, not through sas_execute_task.

I still think that for these cases we need to be running the patch I
previously sent to the list to try to get the abort to work (this patch is
not in the git tree, so one needs to add it on top of the git source).
This will not solve the initial command timeout, but it would at least
address the tmf timeout.

We also need to address the first issue, the inquiry timeout. Previous
runs showed that we were hitting this error a lot on the inquiry to the
Vitesse SES device, for which the adp driver has created a workaround (it
is unclear whether the workaround solves the issue or not).
-andmike
--
Michael Anderson
andmike@us.ibm.com