From: Mike Anderson <andmike@linux.vnet.ibm.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
Boaz Harrosh <bharrosh@panasas.com>,
SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] SCSI: handle HARDWARE_ERROR sense correctly
Date: Thu, 4 Dec 2008 15:39:42 -0800 [thread overview]
Message-ID: <20081204233942.GA4950@linux.vnet.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0812041629320.2577-100000@iolanthe.rowland.org>
Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Dec 2008, James Bottomley wrote:
>
> > On Thu, 2008-12-04 at 15:49 -0500, Alan Stern wrote:
> > > This patch (as1183) fixes a bug in scsi_check_sense(). The routine is
> > > documented as returning one of SUCCESS, FAILED, or NEEDS_RETRY. But
> > > in the HARDWARE_ERROR case it can return ADD_TO_MLQUEUE. And since it
> > > does this without bothering to increment the retry count, it can lead
> > > to an infinite retry loop.
> > >
> > > The fix is to return NEEDS_RETRY instead. Then the caller,
> > > scsi_decide_disposition(), will do the right thing.
> >
> > OK, but why?
> >
> > The current behaviour is to retry the error until the command timeout
> > expires, which, I think is what was needed by the annoying arrays that
> > have retryable hardware errors.
> >
Yes there are some arrays that need this behavior. The two users:
usb disks and the devinfo entries with BLIST_RETRY_HWERROR appear to have
two different expected behaviors.
> > What bug would this patch fix? Because I can see it causing problems
> > with the arrays that originally reported this problem.
>
> You're right and I was wrong. It only _appeared_ to be an infinite
> retry loop because there were 6 iterations allowed and each iteration
> had a 60-second timeout. So I retract the patch submission.
>
> Waiting six minutes for a command to complete is a bit ridiculous,
> though -- especially when the user program generating that command
> decides to reissue it a few times. Can we do anything to expedite the
> failure?
A short term hack until retry policy is aligned and updated would be to
use the sdev_bflags to differentiate the usb case from use of the
BLIST_RETRY_HWERROR bflag.
> For example, does it really make sense for scsi_softirq_done
> to multiply cmd->allowed by rq->timeout? After all, if a command
> aborts with a timeout instead of failing outright, what point is there
> in retrying it? The proper approach would have been to use a longer
> timeout initially.
>
The wait_for is used for more than retries of timeouts.
I had thought it might be a good idea to expose the wait_for value and
then users could control the wait_for behavior if needed by using a udev
rule to set it near the IO timeout value if so required.
> By the way, is there any reason to keep scmd->retries now? If commands
> are going to be retried until they timeout, why bother to count the
> retries?
Previously I had submitted some patches on scsi mid retry with a short text
on current retry policy (this cover mid retry policy vs
scsi_io_completion, which should be unified).
http://marc.info/?l=linux-scsi&m=122210133628085&w=2
I will try to refresh my patches with a updated policy document and also
align that with the changes to scsi_io_completion posted prior to
re-submit.
-andmike
--
Michael Anderson
andmike@linux.vnet.ibm.com
next prev parent reply other threads:[~2008-12-04 23:39 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-04 20:49 [PATCH] SCSI: handle HARDWARE_ERROR sense correctly Alan Stern
2008-12-04 21:02 ` James Bottomley
2008-12-04 21:45 ` Alan Stern
2008-12-04 23:39 ` Mike Anderson [this message]
2008-12-08 15:10 ` Alan Stern
2008-12-16 15:27 ` Alan Stern
2008-12-16 19:14 ` James Bottomley
2008-12-16 19:56 ` Alan Stern
2008-12-16 21:49 ` James Bottomley
2008-12-17 15:09 ` Alan Stern
2008-12-05 14:41 ` Kai Makisara
2008-12-05 15:45 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081204233942.GA4950@linux.vnet.ibm.com \
--to=andmike@linux.vnet.ibm.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=bharrosh@panasas.com \
--cc=linux-scsi@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox