From: Mike Anderson <andmike@linux.vnet.ibm.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
Boaz Harrosh <bharrosh@panasas.com>,
SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] SCSI: handle HARDWARE_ERROR sense correctly
Date: Thu, 4 Dec 2008 15:39:42 -0800
Message-ID: <20081204233942.GA4950@linux.vnet.ibm.com>
In-Reply-To: <Pine.LNX.4.44L0.0812041629320.2577-100000@iolanthe.rowland.org>
Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Dec 2008, James Bottomley wrote:
>
> > On Thu, 2008-12-04 at 15:49 -0500, Alan Stern wrote:
> > > This patch (as1183) fixes a bug in scsi_check_sense(). The routine is
> > > documented as returning one of SUCCESS, FAILED, or NEEDS_RETRY. But
> > > in the HARDWARE_ERROR case it can return ADD_TO_MLQUEUE. And since it
> > > does this without bothering to increment the retry count, it can lead
> > > to an infinite retry loop.
> > >
> > > The fix is to return NEEDS_RETRY instead. Then the caller,
> > > scsi_decide_disposition(), will do the right thing.
> >
> > OK, but why?
> >
> > The current behaviour is to retry the error until the command timeout
> > expires, which, I think is what was needed by the annoying arrays that
> > have retryable hardware errors.
> >
Yes, there are some arrays that need this behavior. The two users, USB
disks and the devinfo entries with BLIST_RETRY_HWERROR, appear to expect
two different behaviors.
> > What bug would this patch fix? Because I can see it causing problems
> > with the arrays that originally reported this problem.
>
> You're right and I was wrong. It only _appeared_ to be an infinite
> retry loop because there were 6 iterations allowed and each iteration
> had a 60-second timeout. So I retract the patch submission.
>
> Waiting six minutes for a command to complete is a bit ridiculous,
> though -- especially when the user program generating that command
> decides to reissue it a few times. Can we do anything to expedite the
> failure?
A short-term hack, until retry policy is aligned and updated, would be to
use the sdev_bflags to differentiate the USB case from devices that set
the BLIST_RETRY_HWERROR bflag.
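A rough userspace sketch of that idea follows. It is only a model, not
kernel code: the struct, the flag value, and the enum are stand-ins, and
the choice to give the USB case a counted retry is my assumption about
what the hack would look like.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bit for this model; NOT the kernel's real value. */
#define MODEL_BLIST_RETRY_HWERROR (1u << 0)

/* Hypothetical dispositions mirroring the names used in this thread. */
enum model_disposition {
	MODEL_NO_RETRY,        /* hand the error up without retrying */
	MODEL_NEEDS_RETRY,     /* counted retry, bounded by the retry count */
	MODEL_ADD_TO_MLQUEUE   /* requeue until the wait_for window expires */
};

/* Minimal model of the two per-device fields the hack would look at. */
struct model_sdev {
	unsigned int sdev_bflags; /* devinfo flags for the device */
	bool retry_hwerror;       /* set for the devinfo entries and by usb */
};

/* Decide what to do with a HARDWARE_ERROR sense key in this model. */
static enum model_disposition
model_hwerror_disposition(const struct model_sdev *sdev)
{
	if (!sdev->retry_hwerror)
		return MODEL_NO_RETRY;

	/* Arrays listed in devinfo keep the retry-until-timeout behavior. */
	if (sdev->sdev_bflags & MODEL_BLIST_RETRY_HWERROR)
		return MODEL_ADD_TO_MLQUEUE;

	/* Assumed: anything else asking for retries (usb) gets counted ones. */
	return MODEL_NEEDS_RETRY;
}
```

With this split, an array with the bflag set is still retried until the
window expires, while a USB disk that only sets retry_hwerror fails after
a bounded number of retries.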
> For example, does it really make sense for scsi_softirq_done
> to multiply cmd->allowed by rq->timeout? After all, if a command
> aborts with a timeout instead of failing outright, what point is there
> in retrying it? The proper approach would have been to use a longer
> timeout initially.
>
The wait_for value is used for more than retries of timeouts.
I had thought it might be a good idea to expose the wait_for value;
users could then control the wait_for behavior if needed, for example by
using a udev rule to set it near the IO timeout value.
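To put numbers on the six-minute wait discussed above, here is a small
userspace model of the window arithmetic. The function names are made up
for illustration, and the worst case assumed here is that every attempt
runs for the full per-command timeout.

```c
#include <assert.h>

/* Window as described in this thread: allowed tries times the timeout. */
static unsigned int model_wait_for_secs(unsigned int tries,
					unsigned int timeout_secs)
{
	return tries * timeout_secs;
}

/*
 * Count how many attempts fit before the window closes, assuming each
 * attempt consumes the whole timeout (the worst case).
 */
static unsigned int model_attempts_within(unsigned int wait_for_secs,
					  unsigned int timeout_secs)
{
	unsigned int elapsed = 0;
	unsigned int attempts = 0;

	if (timeout_secs == 0)
		return 0;

	while (elapsed < wait_for_secs) {
		elapsed += timeout_secs;
		attempts++;
	}
	return attempts;
}
```

With 6 tries and a 60-second timeout the window is 360 seconds and six
attempts fit in it. If wait_for were exposed and set near the IO timeout
(60 seconds here), only one attempt would fit.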
> By the way, is there any reason to keep scmd->retries now? If commands
> are going to be retried until they timeout, why bother to count the
> retries?
Previously I had submitted some patches on scsi mid retry with a short text
on current retry policy (this covers mid retry policy vs
scsi_io_completion, which should be unified).
http://marc.info/?l=linux-scsi&m=122210133628085&w=2
I will try to refresh my patches with an updated policy document and also
align that with the changes to scsi_io_completion that were posted, before
I resubmit.
-andmike
--
Michael Anderson
andmike@linux.vnet.ibm.com