public inbox for linux-scsi@vger.kernel.org
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Bernd Schubert <bs@q-leap.de>
Cc: Matthew Wilcox <matthew@wil.cx>, linux-scsi@vger.kernel.org
Subject: Re: [PATCH] scsi device recovery
Date: Wed, 12 Dec 2007 10:59:36 -0500
Message-ID: <1197475177.4203.29.camel@localhost.localdomain>
In-Reply-To: <200712121536.10665.bs@q-leap.de>


On Wed, 2007-12-12 at 15:36 +0100, Bernd Schubert wrote:
> On Wednesday 12 December 2007 14:39:27 Matthew Wilcox wrote:
> > On Wed, Dec 12, 2007 at 01:54:14PM +0100, Bernd Schubert wrote:
> > > below is a patch introducing device recovery, trying to prevent i/o
> > > errors when a DID_NO_CONNECT or SOFT_ERROR does happen.
> >
> > Why doesn't the regular scsi_eh do what you need?
> 
> First of all, it is presently simply not called when the two errors above do 
> happen. This could be changed, of course.

Erm, I think you'll find the error handler does activate on
DID_SOFT_ERROR.  It causes a retry via the eh.  DID_NO_CONNECT is an
immediate error with no eh intervention because it means that the target
went away.  Handling this as a retryable error isn't an option because
it will interfere with hotplug.
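
To make the distinction concrete, here is a minimal sketch of the two
dispositions described above. It is a toy model, not the kernel's code:
the enum names, their values and the decide() helper are all
hypothetical stand-ins, and the retry limit is an assumed parameter.

```c
/* Hypothetical stand-ins for the host codes discussed above.
 * Names and values are illustrative, not copied from kernel headers. */
enum host_code { HOST_OK, HOST_NO_CONNECT, HOST_SOFT_ERROR };

/* What happens to a completed command in this toy model. */
enum disposition { DISP_COMPLETE, DISP_RETRY, DISP_FAIL };

/* Decide what to do with a command, given how many times it has
 * already been retried and how many retries are allowed (assumed). */
static enum disposition decide(enum host_code code, int retries, int allowed)
{
    switch (code) {
    case HOST_OK:
        return DISP_COMPLETE;
    case HOST_NO_CONNECT:
        /* Target went away: fail at once so hotplug sees the removal. */
        return DISP_FAIL;
    case HOST_SOFT_ERROR:
        /* Transient error: retry while the budget lasts. */
        return retries < allowed ? DISP_RETRY : DISP_FAIL;
    }
    return DISP_FAIL;
}
```

With an allowance of 5, decide(HOST_SOFT_ERROR, 0, 5) asks for a retry,
while decide(HOST_NO_CONNECT, 0, 5) fails straight away regardless of
the remaining budget.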

> Secondly, I think scsi_eh is in most cases doing too much. We are fighting 
> with flaky Infortrend boxes here, and scsi_eh sometimes manages to crash 
> their scsi channels. In most cases it is sufficient to stall any io to the 
> device and then to resume.

But that's basically the default behaviour of the error handler (stall
then resume).

> For most scsi devices one probably doesn't need a suspend time or it can be 
> very small, this still needs to become configurable via sysfs.

You mean a wait time beyond what the error handler currently does
(basically it waits for the quiesce, begins error handling and then
sends a test unit ready when it finishes before restarting)?
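
The sequence in the parentheses can be sketched as a small state
machine. This is only an illustration of the ordering described here,
under the assumption that each phase simply hands over to the next; the
phase names and the eh_next() helper are hypothetical and do not
correspond to kernel symbols.

```c
/* Hypothetical phases, in the order described above. */
enum eh_phase {
    EH_QUIESCE,          /* wait for outstanding commands to drain */
    EH_RECOVER,          /* the error handling proper */
    EH_TEST_UNIT_READY,  /* probe the device once recovery finishes */
    EH_RESTART           /* resume normal I/O */
};

/* Return the phase that follows `cur`. The handler stays in
 * EH_QUIESCE until no commands remain in flight. */
static enum eh_phase eh_next(enum eh_phase cur, int in_flight)
{
    switch (cur) {
    case EH_QUIESCE:
        return in_flight > 0 ? EH_QUIESCE : EH_RECOVER;
    case EH_RECOVER:
        return EH_TEST_UNIT_READY;
    case EH_TEST_UNIT_READY:
        return EH_RESTART;
    case EH_RESTART:
        return EH_RESTART;
    }
    return EH_RESTART;
}
```

The sketch leaves out what happens when the test unit ready fails,
since that is not covered above.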

> Thirdly, scsi_eh doesn't give up, in most cases, when the scsi channel of a 
> Infortrend box crashed, it tried forever to recover.
> To improve this is still on my todo list.

Could you send traces for this?  I thought the error handler had been
fixed over the last few years so that it always terminates.  If there's
a case where it doesn't, this needs fixing.

James


