public inbox for linux-scsi@vger.kernel.org
* fastfail operation and retries
@ 2005-04-19 17:19 Andreas Herrmann
  2005-04-21 16:42 ` Patrick Mansfield
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Herrmann @ 2005-04-19 17:19 UTC (permalink / raw)
  To: Linux SCSI

Hi,

I have some questions regarding the fastfail operation of the SCSI stack.

While performing multipath tests with an IBM ESS, I encountered problems.
During certain operations on an ESS (quiesce/resume and such), requests
on all paths fail temporarily with a data underrun (resid is set in
the FCP response).  In another situation, abort sequences happen (see
FC-FS).

In both cases it is not a path failure; rather, the device (ESS) reports
error conditions temporarily (for some seconds).

Now, when an error occurs on the first path, the multipath layer initiates
failover to the other available path(s), where requests immediately fail
as well.

With linux-2.4 and LVM such problems did not occur. There were
enough retries (5 for each path) to handle such situations.

Now, if the FASTFAIL flag is set, the SCSI stack prevents retries of
failed SCSI commands.

The problem is that the multipath layer cannot distinguish between path
and device failures (and won't retry the failed request on the same
path anyway).
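
To illustrate the failure mode described above, here is a small userspace
model in Python. It is only a sketch, not kernel code: the names `Device` and
`multipath_io` are made up for illustration, the path count of four is an
assumption, and the "some seconds" of device trouble are modelled simply as a
number of commands the device rejects regardless of path.

```python
from dataclasses import dataclass

NUM_PATHS = 4  # assumed number of paths to the device


@dataclass
class Device:
    """Hypothetical device that rejects its next `bad_cmds` commands on any path."""
    bad_cmds: int

    def submit(self) -> bool:
        # The error is a device condition, so it does not depend on the path.
        if self.bad_cmds > 0:
            self.bad_cmds -= 1
            return False
        return True


def multipath_io(dev: Device, retries_per_path: int) -> bool:
    """Try each path in turn; return True as soon as one attempt succeeds."""
    for _path in range(NUM_PATHS):
        for _attempt in range(1 + retries_per_path):
            if dev.submit():
                return True
        # Every attempt on this path failed: fail over to the next path.
    return False


# The device misbehaves for six commands, which is more than the number of paths.
print("fastfail:", "ok" if multipath_io(Device(6), 0) else "I/O error")
print("5 retries per path:", "ok" if multipath_io(Device(6), 5) else "I/O error")
```

With these numbers the fastfail case uses up all four paths and ends in an
I/O error, while the case with retries outlasts the device condition and
succeeds on the second path.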

How can an LLD force the SCSI stack to retry a failed SCSI command
(without using DID_REQUEUE or DID_IMM_RETRY, neither of which changes
the retry counter)?

What about a DID_FORCE_RETRY?  Or is there any outlook as to when there
will be a better interface between the SCSI stack and the multipath
layer to handle retries properly?
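
The semantics asked for here could be sketched roughly as follows. This is a
simplified Python model under stated assumptions, not the mid-layer's actual
code: DID_FORCE_RETRY is only the proposed code and does not exist,
`OTHER_FAILURE` is a placeholder for any ordinary failure, and `Command` and
`should_retry` are hypothetical names. The one property taken from the text
above is that DID_REQUEUE and DID_IMM_RETRY leave the retry counter unchanged.

```python
from enum import Enum, auto


class HostByte(Enum):
    OTHER_FAILURE = auto()    # placeholder for an ordinary failed command
    DID_REQUEUE = auto()      # retried, retry counter unchanged
    DID_IMM_RETRY = auto()    # retried, retry counter unchanged
    DID_FORCE_RETRY = auto()  # proposed only: counted retry that ignores fastfail


class Command:
    """Hypothetical command state: retry counter, retry limit, fastfail flag."""

    def __init__(self, allowed: int, fastfail: bool) -> None:
        self.retries = 0
        self.allowed = allowed
        self.fastfail = fastfail


def should_retry(cmd: Command, result: HostByte) -> bool:
    """Decide whether a failed command is retried on the same path."""
    if result in (HostByte.DID_REQUEUE, HostByte.DID_IMM_RETRY):
        return True  # unbounded: the counter is never incremented
    cmd.retries += 1
    if cmd.retries > cmd.allowed:
        return False  # retry budget exhausted
    if result is HostByte.DID_FORCE_RETRY:
        return True  # bounded retry even when fastfail is set
    return not cmd.fastfail  # ordinary failure: fastfail suppresses the retry


cmd = Command(allowed=5, fastfail=True)
print([should_retry(cmd, HostByte.DID_FORCE_RETRY) for _ in range(6)])
```

The point of the sketch is that such a code would stay bounded by the retry
counter, unlike DID_REQUEUE and DID_IMM_RETRY, while still letting the LLD
override fastfail for errors it knows to be temporary device conditions.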


Regards,

Andreas


* Re: fastfail operation and retries
@ 2005-04-20  8:10 Andreas Herrmann
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Herrmann @ 2005-04-20  8:10 UTC (permalink / raw)
  To: 由渊霞; +Cc: Linux SCSI

        由渊霞 <yxyou@yahoo.com.cn> wrote:
        20.04.2005 03:17
 
> what multipath are you using? Software, or hardware,
> or both?

We are using udm with evms (Linux on zSeries).
The hardware setup is:
- switched fabric FC-SAN,
- 4 paths to each FC-LUN on the ESS 800

All 4 paths are "failing fast" during operations on
the ESS, and our stress test tool encounters I/O errors.


Regards,

Andreas



end of thread, other threads:[~2005-04-22  0:22 UTC | newest]

Thread overview: 7+ messages
2005-04-19 17:19 fastfail operation and retries Andreas Herrmann
2005-04-21 16:42 ` Patrick Mansfield
2005-04-21 19:54   ` Lars Marowsky-Bree
2005-04-21 22:13     ` Patrick Mansfield
2005-04-21 22:52       ` Lars Marowsky-Bree
2005-04-22  0:22         ` Patrick Mansfield
  -- strict thread matches above, loose matches on Subject: below --
2005-04-20  8:10 Andreas Herrmann
