Mid-layer handling of NOT_READY conditions...

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* Mid-layer handling of NOT_READY conditions...
@ 2005-01-28 23:24 Andrew Vasquez
  2005-01-29  5:46 ` Andrew Vasquez
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Vasquez @ 2005-01-28 23:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: andrew.vasquez

[PREFACE: Please forgive the rather long absence on linux-scsi, I've
been occupied by several non-related projects]


All,

While stripping out the remnants of internal queuing from the qla2xxx
driver and adding-in support for various fc_host/fc_remote constructs,
I've ran into a rather peculiar problem with respect to the way the SCSI
mid-layer handles NOT_READY conditions (notably ASC 0x04 and ASCQ 0x01).

I was doing simple short-duration cable-pulls when I noticed I/O errors
would occur at unexpected times as the storage returned to the topology.
The simplest case goes like this:

      * Issue I/O to device A
      * Device A falls off the topology 
      * Driver (qla2xxx) blocks additional requests to device A via
        fc_remote_port_block() 
      * Short time later (couple of seconds) device A returns to
        topology
      * Driver logs-into device and unblocks requests via
        fc_remote_port_unblock().
      * I/O resumes

The storage still unable to process the commands returns
check-conditions (please excuse the crude printk()s):

        *** check 1148/1/5 [1:0] sdev_st=2 status=2 [6/29/0].
        *** check 1149/1/5 [1:0] sdev_st=2 status=2 [2/4/1].
        scsi_decide_disposition: sc 0 RETRY incremented 2/5
        *** check 1150/2/5 [1:0] sdev_st=2 status=2 [2/4/1].
        scsi_decide_disposition: sc 0 RETRY incremented 3/5
        *** check 1151/3/5 [1:0] sdev_st=2 status=2 [2/4/1].
        scsi_decide_disposition: sc 0 RETRY incremented 4/5
        *** check 1152/4/5 [1:0] sdev_st=2 status=2 [2/4/1].
        
while scsi_decide_disposition() agrees to retry the commands since
cmd->retries < cmd->allowed.  But when NOT_READYs persists beyond
cmd->allowed, scsi_decide_disposition() returns SUCCESS:

        scsi_decide_disposition: sc 0 2 SUCCESS 6/5 [2/4/1]

and the command then begins additional processing via:

        scsi_finish_command()
          sd_rw_itr()
            scsi_io_completion()
        
at which point, the following check is made:

        ...
        /*
         * If the device is in the process of becoming ready,
         * retry.
         */
        if (sshdr.asc == 0x04 && sshdr.ascq == 0x01) {
                scsi_requeue_command(q, cmd);
                return;
        }
        
and the command is requeued to the request-q via blk_insert_request()
and started again with:

        q->request_fn()
          scsi_request_fn()
            scsi_dispatch_cmd()
        
There seems to be two problem with this approach:

     1. As the storage continues to return NOT_READY,
        scsi_decide_disposition() blindly increments cmd->retries and
        checks against cmd->allowed, returning SUCCESS (since at this
        point cmd->retries is always greater than cmd->allowed) -- I've
        seen this condition loop several hundred times while the
        NOT_READY condition clears.
     2. as a result of the (cmd->retries > cmd->allowed) state of the
        command, if a LLDD returns any status (other than DID_OK) which
        could initiate a retry, the command is immediately failed.  As
        an example, the qla2xxx driver returns DID_BUS_BUSY in case of
        any 'transport' related problems during the exchange (dropped
        frames, FCP protocal failures, etc.).

When the qla2xxx driver managed command queuing internally, a NOT_READY
status would cause the lun-queue to be frozen for some period time while
the storage settled-down.

Would this be an approach to consider?  Or should we tackle the problem
by addressing the quirky (cmd->retries > cmd->allowed) state?

Thanks,
Andrew Vasquez

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-28 23:24 Mid-layer handling of NOT_READY conditions Andrew Vasquez
@ 2005-01-29  5:46 ` Andrew Vasquez
  2005-01-29 16:16   ` Matthew Wilcox
  2005-01-29 16:44   ` James Bottomley
  0 siblings, 2 replies; 15+ messages in thread
From: Andrew Vasquez @ 2005-01-29  5:46 UTC (permalink / raw)
  To: linux-scsi

On Fri, 2005-01-28 at 15:24 -0800, Andrew Vasquez wrote:
> ...        
> There seems to be two problem with this approach:
> 
>      1. As the storage continues to return NOT_READY,
>         scsi_decide_disposition() blindly increments cmd->retries and
>         checks against cmd->allowed, returning SUCCESS (since at this
>         point cmd->retries is always greater than cmd->allowed) -- I've
>         seen this condition loop several hundred times while the
>         NOT_READY condition clears.
>      2. as a result of the (cmd->retries > cmd->allowed) state of the
>         command, if a LLDD returns any status (other than DID_OK) which
>         could initiate a retry, the command is immediately failed.  As
>         an example, the qla2xxx driver returns DID_BUS_BUSY in case of
>         any 'transport' related problems during the exchange (dropped
>         frames, FCP protocal failures, etc.).
> 
> When the qla2xxx driver managed command queuing internally, a NOT_READY
> status would cause the lun-queue to be frozen for some period time while
> the storage settled-down.
> 

Returning back DID_IMM_RETRY for these 'transport' related conditions
would of course help in this issue -- but at the same time bring with it
several side-effects which may not be desirable.

So, beyond this particular circumstance, what would be considered a
'proper' return status for this type of event? 

> Would this be an approach to consider?  Or should we tackle the problem
> by addressing the quirky (cmd->retries > cmd->allowed) state?
> 

--
Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29  5:46 ` Andrew Vasquez
@ 2005-01-29 16:16   ` Matthew Wilcox
  2005-01-29 16:44   ` James Bottomley
  1 sibling, 0 replies; 15+ messages in thread
From: Matthew Wilcox @ 2005-01-29 16:16 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: linux-scsi

On Fri, Jan 28, 2005 at 09:46:06PM -0800, Andrew Vasquez wrote:
> On Fri, 2005-01-28 at 15:24 -0800, Andrew Vasquez wrote:
> > When the qla2xxx driver managed command queuing internally, a NOT_READY
> > status would cause the lun-queue to be frozen for some period time while
> > the storage settled-down.
> 
> Returning back DID_IMM_RETRY for these 'transport' related conditions
> would of course help in this issue -- but at the same time bring with it
> several side-effects which may not be desirable.
> 
> So, beyond this particular circumstance, what would be considered a
> 'proper' return status for this type of event? 

I suspect the 'proper' return status for this kind of event would be
to get the cmd back to the block layer asap so it can be retried down
a different path if we have multiple paths to the device.  Not that I'm
an expert ...

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29  5:46 ` Andrew Vasquez
  2005-01-29 16:16   ` Matthew Wilcox
@ 2005-01-29 16:44   ` James Bottomley
  2005-01-29 19:34     ` Patrick Mansfield
  1 sibling, 1 reply; 15+ messages in thread
From: James Bottomley @ 2005-01-29 16:44 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: SCSI Mailing List

On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
> Returning back DID_IMM_RETRY for these 'transport' related conditions
> would of course help in this issue -- but at the same time bring with it
> several side-effects which may not be desirable.
> 
> So, beyond this particular circumstance, what would be considered a
> 'proper' return status for this type of event? 

Well, the correct return, since this is a condition from the storage, is
simply the check condition and the sense code (rather than having the
driver interpret it).

> > Would this be an approach to consider?  Or should we tackle the problem
> > by addressing the quirky (cmd->retries > cmd->allowed) state?

That's what I think the correct approach should be....we have a few
other quirky devices that aren't pleased with our current NOT_READY
handling.  Were you going to look into coding up a patch for this?

James



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29 16:44   ` James Bottomley
@ 2005-01-29 19:34     ` Patrick Mansfield
  2005-01-30  1:40       ` James Bottomley
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Patrick Mansfield @ 2005-01-29 19:34 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Vasquez, SCSI Mailing List

On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
> On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
> > Returning back DID_IMM_RETRY for these 'transport' related conditions
> > would of course help in this issue -- but at the same time bring with it
> > several side-effects which may not be desirable.
> > 
> > So, beyond this particular circumstance, what would be considered a
> > 'proper' return status for this type of event? 
> 
> Well, the correct return, since this is a condition from the storage, is
> simply the check condition and the sense code (rather than having the
> driver interpret it).

But the transport hit a failure, not the storage device.

I thought Andrew hit this sequence:

	- pull / replace cable

	- IO resumes but gets NOT_READY (the device could be logging back
	  into the fibre or such)

	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
	  scmd->retries has already been exhausted by the NOT_READY

Did I misread something?

> > > Would this be an approach to consider?  Or should we tackle the problem
> > > by addressing the quirky (cmd->retries > cmd->allowed) state?
> 
> That's what I think the correct approach should be....we have a few
> other quirky devices that aren't pleased with our current NOT_READY
> handling.  Were you going to look into coding up a patch for this?

We don't track what errors caused a retry (doing so is too painful), or
reset the retries. In scsi_decide_disposition() if we get a few retry
cases for one or multiple errors, and then a different error that should
reasonably be a retry case, we return SUCCESS instead of NEEDS_RETRY.

Why not just set scmd->retries to zero in scsi_requeue_command()?

All callers are cases that we want to keep retrying if other errors are hit,
and would fix other potential retry problems, not only the NOT_READY case.

[There is one bad looking scsi_requeue_command() for UNIT_ATTENTION that
looks like it could retry forever, independent of this problem.]

Fixing the NOT_READY case to quiesce (and not incrementing retries) would
fix the problem or make it much less likely, and is still a good idea.

And as a long term goal, losing the retry count and moving to allowing all
retries for a period of time would avoid other potential problems, and not
be tied to the speed of the system.

-- Patrick Mansfield

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29 19:34     ` Patrick Mansfield
@ 2005-01-30  1:40       ` James Bottomley
  2005-01-30  2:33       ` Douglas Gilbert
  2005-01-31  7:47       ` Andrew Vasquez
  2 siblings, 0 replies; 15+ messages in thread
From: James Bottomley @ 2005-01-30  1:40 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: Andrew Vasquez, SCSI Mailing List

On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
> But the transport hit a failure, not the storage device.
> 
> I thought Andrew hit this sequence:
> 
> 	- pull / replace cable
> 
> 	- IO resumes but gets NOT_READY (the device could be logging back
> 	  into the fibre or such)
> 
> 	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
> 	  scmd->retries has already been exhausted by the NOT_READY
> 
> Did I misread something?

Erm, not sure.  Perhaps I'm confused.  I thought it was the *device*
that had responded NOT_READY.  Obviously, if it's the driver
manufacturing NOT_READY sense because of some transport condition, then
it needs to be correctly reported as a DID_...

James



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29 19:34     ` Patrick Mansfield
  2005-01-30  1:40       ` James Bottomley
@ 2005-01-30  2:33       ` Douglas Gilbert
  2005-01-31  7:47       ` Andrew Vasquez
  2 siblings, 0 replies; 15+ messages in thread
From: Douglas Gilbert @ 2005-01-30  2:33 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: James Bottomley, Andrew Vasquez, SCSI Mailing List

Patrick Mansfield wrote:
> On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
> 
>>On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
>>
>>>Returning back DID_IMM_RETRY for these 'transport' related conditions
>>>would of course help in this issue -- but at the same time bring with it
>>>several side-effects which may not be desirable.
>>>
>>>So, beyond this particular circumstance, what would be considered a
>>>'proper' return status for this type of event? 
>>
>>Well, the correct return, since this is a condition from the storage, is
>>simply the check condition and the sense code (rather than having the
>>driver interpret it).
> 
> 
> But the transport hit a failure, not the storage device.
> 
> I thought Andrew hit this sequence:
> 
> 	- pull / replace cable
> 
> 	- IO resumes but gets NOT_READY (the device could be logging back
> 	  into the fibre or such)
> 
> 	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
> 	  scmd->retries has already been exhausted by the NOT_READY
> 
> Did I misread something?

Patrick,
I was also thinking of commenting on this. It depends on
where the failure is:
   a) between the device server (target) and a logical unit (lu)
   b) in the service delivery subsystem between the
      initiator (port) and the target (port).

James's explanation covers case a) (i.e. the device server
should constuct appropriate sense data and a SCSI status
in response to the current and future SCSI commands.
In case b) the reponse is transport dependent.
For example, in the case of SAS there are two further
situations:
    1) the failure occurs on a direct connect between the
       initiator (port) and the target (port) [e.g. between
       a HBA port and a target port on a disk].
       Then a low level state machine (phy/link layer) on
       the HBA will notice the problem
    2) the failure occurs between an expander and an end
       device (e.g. a tape drive). Then the expander issues
       a BROADCAST(CHANGE) link layer primitive which the
       initiator(s) will receive. In reponse to this the
       initiator(s) should do another discovery process
       to find the new topology (via SMP).

Also both of these situations are detected in real time
(more or less), not when the next command is issued.
New SCSI commands will fail relatively quickly when
the SAS HBA fails to open a connection to the target.
SCSI commands "in flight" to an effected target should
trigger connection timeouts in the initiator.

Doug Gilbert

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-29 19:34     ` Patrick Mansfield
  2005-01-30  1:40       ` James Bottomley
  2005-01-30  2:33       ` Douglas Gilbert
@ 2005-01-31  7:47       ` Andrew Vasquez
  2 siblings, 0 replies; 15+ messages in thread
From: Andrew Vasquez @ 2005-01-31  7:47 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: James Bottomley, SCSI Mailing List

On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
> On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
> > On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
> > > Returning back DID_IMM_RETRY for these 'transport' related conditions
> > > would of course help in this issue -- but at the same time bring with it
> > > several side-effects which may not be desirable.
> > > 
> > > So, beyond this particular circumstance, what would be considered a
> > > 'proper' return status for this type of event? 
> > 
> > Well, the correct return, since this is a condition from the storage, is
> > simply the check condition and the sense code (rather than having the
> > driver interpret it).
> 
> But the transport hit a failure, not the storage device.
> 
> I thought Andrew hit this sequence:
> 
> 	- pull / replace cable
> 
> 	- IO resumes but gets NOT_READY (the device could be logging back
> 	  into the fibre or such)
> 
> 	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
> 	  scmd->retries has already been exhausted by the NOT_READY
> 
> Did I misread something?
> 

No, that's correct -- sorry about the confusion my second email caused.
I had only inquired about the 'correct' return status in the context of
avoiding the (cmd-retries > cmd->allowed) failure.

> > > > Would this be an approach to consider?  Or should we tackle the problem
> > > > by addressing the quirky (cmd->retries > cmd->allowed) state?
> > 
> > That's what I think the correct approach should be....we have a few
> > other quirky devices that aren't pleased with our current NOT_READY
> > handling.  Were you going to look into coding up a patch for this?
> 
> We don't track what errors caused a retry (doing so is too painful), or
> reset the retries. In scsi_decide_disposition() if we get a few retry
> cases for one or multiple errors, and then a different error that should
> reasonably be a retry case, we return SUCCESS instead of NEEDS_RETRY.
> 
> Why not just set scmd->retries to zero in scsi_requeue_command()?
> 

This is exactly what I was thinking would be a fairly straight-forward
approach at solving the problem...
 
> All callers are cases that we want to keep retrying if other errors are hit,
> and would fix other potential retry problems, not only the NOT_READY case.
> 

given this fact.  I could code up a quick patch if this would be
acceptable???

> [There is one bad looking scsi_requeue_command() for UNIT_ATTENTION that
> looks like it could retry forever, independent of this problem.]
>

We could also retry forever if the storage never transitions from its
NOT_READY state (unlikely - unless totally borken).

> Fixing the NOT_READY case to quiesce (and not incrementing retries) would
> fix the problem or make it much less likely, and is still a good idea.
> 

Yes, pounding on the storage box seems like a rather unfriendly
approach :-|

--
Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-Layer handling of NOT READY conditions...
@ 2005-01-31  9:46 EXT / DEVOTEAM VAROQUI Christophe
  0 siblings, 0 replies; 15+ messages in thread
From: EXT / DEVOTEAM VAROQUI Christophe @ 2005-01-31  9:46 UTC (permalink / raw)
  To: 'linux-scsi@vger.kernel.org'

Hello,

a related discussion is ongoing in the multipath area.

Edward Goggin, EMC, noted a LU can be remapped anytime, by way of an HBA
recabling, or a LUN masking reconfiguration or a LU renaming. If IO are
queued to sda (LU id == 1234), sda goes unreachable (say because of a cable
unplug), when sda come back online it may not be 1234 anymore but 4321.
Requeuing the IO to sda will then silently corrupt 4321.

There needs to be a way to ask the HBA driver to error out immediatly IO
submitted to sda.
Can you confirm a "timeout" driver module parameter set to 0 can bring us
this behaviour ?

regards,
cvaroqui

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Mid-Layer handling of NOT READY conditions...
@ 2005-01-31 14:07 goggin, edward
  0 siblings, 0 replies; 15+ messages in thread
From: goggin, edward @ 2005-01-31 14:07 UTC (permalink / raw)
  To: 'linux-scsi@vger.kernel.org'
  Cc: 'EXT / DEVOTEAM VAROQUI Christophe'

> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org 
> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of EXT / 
> DEVOTEAM VAROQUI Christophe
> Sent: Monday, January 31, 2005 4:46 AM
> To: 'linux-scsi@vger.kernel.org'
> Subject: Re: Mid-Layer handling of NOT READY conditions...
> 
> Hello,
> 
> a related discussion is ongoing in the multipath area.
> 
> Edward Goggin, EMC, noted a LU can be remapped anytime, by 
> way of an HBA
> recabling, or a LUN masking reconfiguration or a LU renaming. 
> If IO are
> queued to sda (LU id == 1234), sda goes unreachable (say 
> because of a cable
> unplug), when sda come back online it may not be 1234 anymore 
> but 4321.
> Requeuing the IO to sda will then silently corrupt 4321.
> 
> There needs to be a way to ask the HBA driver to error out 
> immediatly IO
> submitted to sda.
> Can you confirm a "timeout" driver module parameter set to 0 
> can bring us
> this behaviour ?
> 

Looks like the SCSI mid-layer may be already doing __most__ of the
"right thing" - but (1) the right thing isn't happening for EMC CLARiion
or Symmetrix logical units (maybe other storage also) because they are
not being treated as "removable" units by the mid-layer and (2) there
Does not seem to be any refresh of the cached inquiry data
(vendor/model/rev) in the mid-layer's scsi_device data structure.
Not clear to me if these storage systems should be setting the RMB
bit of the standard inquiry reply (which CLARiion and Symmetrix are
not doing) or if the linux SCSI mid-layer should just be treating all
SAN storage units as removable units, independent of the state of this bit.

The SCSI mid-layer (scsi_io_completion()) detects a SCSI sense key of
UNIT_ATTENTION after most any attempt to access the SCSI logical unit
With the "new" identity for the first time.  As long as the logical unit
is viewed as "removable" media, the most of right thing happens, namely,
the I/O in question is failed and the device is marked to prevent any
further I/O.  Apparently calling check_disk_change() from the next
sd_open() will at least clear the changed field of the scsi_device,
thereby allowing I/O to the device.  But, the cached inquiry fields
are not updated (possibly via scsi_probe_lun()) to reflect the
possibly new device identity.

> regards,
> cvaroqui
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Mid-layer handling of NOT_READY conditions...
@ 2005-01-31 16:56 James.Smart
  2005-01-31 17:36 ` Patrick Mansfield
  2005-01-31 18:22 ` Andrew Vasquez
  0 siblings, 2 replies; 15+ messages in thread
From: James.Smart @ 2005-01-31 16:56 UTC (permalink / raw)
  To: andrew.vasquez, patmans; +Cc: James.Bottomley, linux-scsi

> On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
> > On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
> > > On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
> > > > Returning back DID_IMM_RETRY for these 'transport' 
> related conditions
> > > > would of course help in this issue -- but at the same 
> time bring with it
> > > > several side-effects which may not be desirable.
> > > > 
> > > > So, beyond this particular circumstance, what would be 
> considered a
> > > > 'proper' return status for this type of event? 
> > > 
> > > Well, the correct return, since this is a condition from 
> the storage, is
> > > simply the check condition and the sense code (rather 
> than having the
> > > driver interpret it).
> > 
> > But the transport hit a failure, not the storage device.
> > 
> > I thought Andrew hit this sequence:
> > 
> > 	- pull / replace cable
> > 
> > 	- IO resumes but gets NOT_READY (the device could be 
> logging back
> > 	  into the fibre or such)
> > 
> > 	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
> > 	  scmd->retries has already been exhausted by the NOT_READY
> > 
> > Did I misread something?
> > 
> 
> No, that's correct -- sorry about the confusion my second 
> email caused.
> I had only inquired about the 'correct' return status in the 
> context of
> avoiding the (cmd-retries > cmd->allowed) failure.

So this maps into the fc_target_block/unblock functionality that was
added to the fc class...  Adapter notifies driver of cable loss and
starts the block, driver does not "resume" the traffic until the
firmware says the login, etc has the device ready to accept scsi
traffic (Note: it does not guarantee the device can't respond with
a NOT_READY sense code).  If the transport hits a problem, there's
no harm done as long as the problem is resolved within the block
timeout. If the timeout is hit - it's because the user dicated that
it wanted to know of errors within this time and if the device fails,
it fails...

In the multipath solution - the "block" time used by the transport gets
set to 0 (or 1 second), so the i/o fails quickly and the multipath
function can kick in.

I am not a fan of a driver manufacturing a NOT_READY condition...

> > 
> > Why not just set scmd->retries to zero in scsi_requeue_command()?
> > 
> 
> This is exactly what I was thinking would be a fairly straight-forward
> approach at solving the problem...

This is ultimately a hack, and raises the potential for the retries value
to perpetually be rezero'd.  The better solution is the use the block
primitives available to avoid the i/o being issued at all if the transport
can't handle it.

James S

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-31 16:56 Mid-layer handling of NOT_READY conditions James.Smart
@ 2005-01-31 17:36 ` Patrick Mansfield
  2005-02-01  7:21   ` Andrew Vasquez
  2005-01-31 18:22 ` Andrew Vasquez
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Mansfield @ 2005-01-31 17:36 UTC (permalink / raw)
  To: James.Smart; +Cc: andrew.vasquez, James.Bottomley, linux-scsi

On Mon, Jan 31, 2005 at 11:56:02AM -0500, James.Smart@Emulex.Com wrote:
> > On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:

> > > 
> > > Why not just set scmd->retries to zero in scsi_requeue_command()?
> > > 
> > 
> > This is exactly what I was thinking would be a fairly straight-forward
> > approach at solving the problem...
> 
> This is ultimately a hack, and raises the potential for the retries value
> to perpetually be rezero'd.  The better solution is the use the block
> primitives available to avoid the i/o being issued at all if the transport
> can't handle it.

No, it does not change the potential to retry forever, someone still has
to requeue the IO again outside of the NEEDS_RETRY/scsi_retry_command case
for that to happen.

We only check retries in scsi_decide_disposition (well not counting error
handling), and if we hit the limit, return SUCCESS. The change is that we
reset retries to zero if the command is *not* retried via
NEEDS_RETRY/scsi_retry_command.

It would be even clearer to zero retries in scsi_decide_disposition.

For NOT_READY, we would be better off always using the
scsi_requeue_command path ever: get rid of the check in scsi_check_sense,
as it will be requeued via scsi_io_completion code. This would have to
happen even if delaying retries to NOT_READY devices.

But yes, it is better to stop IO if the transport can't handle it, and
would likely avoid the problem (if we only got NOT_READY's and never
returned DID_BUS_BUSY). 

But it is still a bug to not reset retries.

Maybe I need to hack scsi_debug to demonstrate the problem ...

-- Patrick Mansfield

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mid-layer handling of NOT_READY conditions...
  2005-01-31 17:36 ` Patrick Mansfield
@ 2005-02-01  7:21   ` Andrew Vasquez
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Vasquez @ 2005-02-01  7:21 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: James.Smart, James.Bottomley, linux-scsi

On Mon, 2005-01-31 at 09:36 -0800, Patrick Mansfield wrote:
> On Mon, Jan 31, 2005 at 11:56:02AM -0500, James.Smart@Emulex.Com wrote:
> > > On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
> 
> > > > 
> > > > Why not just set scmd->retries to zero in scsi_requeue_command()?
> > > > 
> > > 
> > > This is exactly what I was thinking would be a fairly straight-forward
> > > approach at solving the problem...
> > 
> > This is ultimately a hack, and raises the potential for the retries value
> > to perpetually be rezero'd.  The better solution is the use the block
> > primitives available to avoid the i/o being issued at all if the transport
> > can't handle it.
> 
> No, it does not change the potential to retry forever, someone still has
> to requeue the IO again outside of the NEEDS_RETRY/scsi_retry_command case
> for that to happen.
> 
> We only check retries in scsi_decide_disposition (well not counting error
> handling), and if we hit the limit, return SUCCESS. The change is that we
> reset retries to zero if the command is *not* retried via
> NEEDS_RETRY/scsi_retry_command.
> 
> It would be even clearer to zero retries in scsi_decide_disposition.
> 
> For NOT_READY, we would be better off always using the
> scsi_requeue_command path ever: get rid of the check in scsi_check_sense,
> as it will be requeued via scsi_io_completion code. This would have to
> happen even if delaying retries to NOT_READY devices.
> 

Here's a small patch against the latest scsi-rc-fixes tree I've been
running with which allows my basic cable-pull test to complete without
incident.

Please consider for inclusion.


As per Patrick M's suggestions:

      * reset a command's retries count in scsi_decide_disposition() in
        case of additional requiring by upper layer.
      * remove redundant check for NOT_READY (ASC: 0x04 ASCQ: 0x01) in
        scsi_check_sense().

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>

 scsi_error.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

===== drivers/scsi/scsi_error.c 1.86 vs edited =====
--- 1.86/drivers/scsi/scsi_error.c	2005-01-17 22:54:45 -08:00
+++ edited/drivers/scsi/scsi_error.c	2005-01-31 23:01:54 -08:00
@@ -327,12 +327,6 @@ static int scsi_check_sense(struct scsi_
 			return NEEDS_RETRY;
 		}
 		/*
-		 * if the device is in the process of becoming ready, we 
-		 * should retry.
-		 */
-		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
-			return NEEDS_RETRY;
-		/*
 		 * if the device is not started, we need to wake
 		 * the error handler to start the motor
 		 */
@@ -1405,7 +1399,10 @@ int scsi_decide_disposition(struct scsi_
 	} else {
 		/*
 		 * no more retries - report this one back to upper level.
+		 * clear retries in case the command is requeued by
+		 * upper level.
 		 */
+		scmd->retries = 0;
 		return SUCCESS;
 	}
 }

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Mid-layer handling of NOT_READY conditions...
  2005-01-31 16:56 Mid-layer handling of NOT_READY conditions James.Smart
  2005-01-31 17:36 ` Patrick Mansfield
@ 2005-01-31 18:22 ` Andrew Vasquez
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Vasquez @ 2005-01-31 18:22 UTC (permalink / raw)
  To: James.Smart; +Cc: patmans, James.Bottomley, linux-scsi

On Mon, 2005-01-31 at 11:56 -0500, James.Smart@Emulex.Com wrote:
> > On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
> > > On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
> > > > On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
> > > > > Returning back DID_IMM_RETRY for these 'transport' 
> > related conditions
> > > > > would of course help in this issue -- but at the same 
> > time bring with it
> > > > > several side-effects which may not be desirable.
> > > > > 
> > > > > So, beyond this particular circumstance, what would be 
> > considered a
> > > > > 'proper' return status for this type of event? 
> > > > 
> > > > Well, the correct return, since this is a condition from 
> > the storage, is
> > > > simply the check condition and the sense code (rather 
> > than having the
> > > > driver interpret it).
> > > 
> > > But the transport hit a failure, not the storage device.
> > > 
> > > I thought Andrew hit this sequence:
> > > 
> > > 	- pull / replace cable
> > > 
> > > 	- IO resumes but gets NOT_READY (the device could be 
> > logging back
> > > 	  into the fibre or such)
> > > 
> > > 	- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
> > > 	  scmd->retries has already been exhausted by the NOT_READY
> > > 
> > > Did I misread something?
> > > 
> > 
> > No, that's correct -- sorry about the confusion my second 
> > email caused.
> > I had only inquired about the 'correct' return status in the 
> > context of
> > avoiding the (cmd-retries > cmd->allowed) failure.
> 
> So this maps into the fc_target_block/unblock functionality that was
> added to the fc class...  Adapter notifies driver of cable loss and
> starts the block, driver does not "resume" the traffic until the
> firmware says the login, etc

Yes.

>  has the device ready to accept scsi
> traffic (Note: it does not guarantee the device can't respond with
> a NOT_READY sense code).

Exactly.

>   If the transport hits a problem, there's
> no harm done as long as the problem is resolved within the block
> timeout. If the timeout is hit - it's because the user dicated that
> it wanted to know of errors within this time and if the device fails,
> it fails...
> 
> In the multipath solution - the "block" time used by the transport gets
> set to 0 (or 1 second), so the i/o fails quickly and the multipath
> function can kick in.
> 

A bit confused now, are you proposing that cmd->timeout_per_command time
be inclusive of potential transport failures resulting in a requested
retry?  And thus not be refreshed (as it currently is) upon retry
request.

> I am not a fan of a driver manufacturing a NOT_READY condition...
> 

Again -- there is no manufacturing of check-conditions. Their existence
only highlighted the point that the retries value was being exhausted
(quickly) during the state and thus restricts a LLDD's ability to return
any status which would initiate a normal retry (i.e. DID_BUS_BUSY).

> > > 
> > > Why not just set scmd->retries to zero in scsi_requeue_command()?
> > > 
> > 
> > This is exactly what I was thinking would be a fairly straight-forward
> > approach at solving the problem...
> 
> This is ultimately a hack, and raises the potential for the retries value
> to perpetually be rezero'd.  The better solution is the use the block
> primitives available to avoid the i/o being issued at all if the transport
> can't handle it.
> 

Agree -- the midlayer internally plugging a device for a small period of
time while some NOT_READY (and any other similar) state is received from
the storage is the more appropriate direction.   Perhaps there could be
a combination of timing conditionals -- the fc_starget_dev_loss_tmo() to
time the overall pause in 'not-ready' plugging and a
period-to-wakeup-and-ping-the-storage time within the window?

--
av

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Mid-layer handling of NOT_READY conditions...
@ 2005-01-31 19:07 James.Smart
  0 siblings, 0 replies; 15+ messages in thread
From: James.Smart @ 2005-01-31 19:07 UTC (permalink / raw)
  To: andrew.vasquez; +Cc: patmans, James.Bottomley, linux-scsi

> >   If the transport hits a problem, there's
> > no harm done as long as the problem is resolved within the block
> > timeout. If the timeout is hit - it's because the user dicated that
> > it wanted to know of errors within this time and if the 
> device fails,
> > it fails...
> > 
> > In the multipath solution - the "block" time used by the 
> transport gets
> > set to 0 (or 1 second), so the i/o fails quickly and the multipath
> > function can kick in.
> > 
> 
> A bit confused now, are you proposing that 
> cmd->timeout_per_command time
> be inclusive of potential transport failures resulting in a requested
> retry?  And thus not be refreshed (as it currently is) upon retry
> request.

Nope. I knew I probably said the wrong thing here...

Let the io timeout stays as is. If the block condition occurs - it's up
to the driver take the right action on whether the i/o was killed as a
result or not. If it kills it, the driver should error it right away,
then the block code will hold off the retry. If it isn't killed, it lets
the i/o timer fire normally, which may invoke the eh_abort handler
(which is not held off by the block). After the abort, the retry would
be held off by the blocked state.

You want the dev_loss timer low so that other i/o will get through, and
can fail, thus invoking the device state change (and subsequent aborts
of the pending i/o's).

> Perhaps there could be
> a combination of timing conditionals -- the 
> fc_starget_dev_loss_tmo() to
> time the overall pause in 'not-ready' plugging and a
> period-to-wakeup-and-ping-the-storage time within the window?

Agree. I assume the ping-the-storage function is something done as
part of the multipathing so that it has an accurate state of the path.

-- james

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2005-02-01  7:21 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-01-28 23:24 Mid-layer handling of NOT_READY conditions Andrew Vasquez
2005-01-29  5:46 ` Andrew Vasquez
2005-01-29 16:16   ` Matthew Wilcox
2005-01-29 16:44   ` James Bottomley
2005-01-29 19:34     ` Patrick Mansfield
2005-01-30  1:40       ` James Bottomley
2005-01-30  2:33       ` Douglas Gilbert
2005-01-31  7:47       ` Andrew Vasquez
  -- strict thread matches above, loose matches on Subject: below --
2005-01-31  9:46 Mid-Layer handling of NOT READY conditions EXT / DEVOTEAM VAROQUI Christophe
2005-01-31 14:07 goggin, edward
2005-01-31 16:56 Mid-layer handling of NOT_READY conditions James.Smart
2005-01-31 17:36 ` Patrick Mansfield
2005-02-01  7:21   ` Andrew Vasquez
2005-01-31 18:22 ` Andrew Vasquez
2005-01-31 19:07 James.Smart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox