[PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.

* [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
@ 2009-02-20 11:14 Rengarajan, Narayanan (STSD)
  2009-02-20 15:36 ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Rengarajan, Narayanan (STSD) @ 2009-02-20 11:14 UTC (permalink / raw)
  To: linux-scsi@vger.kernel.org

Hi,

  On the 2.6.27 kernel, disk spin-up is attempted on standby paths until it times out, which lengthens path restoration time.

 Steps to reproduce:
 1. Present a standby LUN to the host.
 2. Run a discovery from the host (scan the SCSI bus).
 3. Disk spin-up messages are observed in /var/log/messages.

Whenever a device goes offline and comes back, the new sd device takes longer
to be created. This is because sd_spinup_disk() keeps trying to spin up the
disk, as the standby paths return NOT READY with ASC/ASCQ 0x04/0x0b.

Recommended patch :

  diff -pNaur /usr/src/linux/drivers/scsi/sd.c sd.c
--- /usr/src/linux/drivers/scsi/sd.c    2009-02-09 22:24:56.000000000 +0530
+++ sd.c        2009-02-19 16:39:16.000000000 +0530
@@ -1181,8 +1181,8 @@ sd_spinup_disk(struct scsi_disk *sdkp)
                 */
                if (sense_valid &&
                    sshdr.sense_key == NOT_READY &&
-                   sshdr.asc == 4 && sshdr.ascq == 3) {
-                       break;          /* manual intervention required */
+                   sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b || sshdr.ascq == 0x0c) ) {
+                       break;  /* manual intervention required || Standby || Unavailable */

                /*
                 * Issue command to spin up drive when not ready

Thanks,
Narayanan

* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 11:14 [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time Rengarajan, Narayanan (STSD)
@ 2009-02-20 15:36 ` James Bottomley
  2009-02-20 15:52   ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 15:36 UTC (permalink / raw)
  To: Rengarajan, Narayanan (STSD); +Cc: linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 11:14 +0000, Rengarajan, Narayanan (STSD) wrote:
> Hi,
> 
>   Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time in 2.6.27 kernel.
> 
>  Steps to reproduce:
>  1. present a standby lun to the host
>  2. do a discovery from the host (scan the scsi bus)
>  3. Spinning of disks is  observed in  /var/log/messages
> 
> Whenever a device goes offline and comes back, the new sd device takes longer
> time to get created. This is because of the spinning up of disk in
> sd_spinup_disk function as the standby paths would return device not ready with
> 0x04/0x0b asc/ascq.
> 
> Recommended patch :
> 
>   diff -pNaur /usr/src/linux/drivers/scsi/sd.c sd.c
> --- /usr/src/linux/drivers/scsi/sd.c    2009-02-09 22:24:56.000000000 +0530
> +++ sd.c        2009-02-19 16:39:16.000000000 +0530
> @@ -1181,8 +1181,8 @@ sd_spinup_disk(struct scsi_disk *sdkp)
>                  */
>                 if (sense_valid &&
>                     sshdr.sense_key == NOT_READY &&
> -                   sshdr.asc == 4 && sshdr.ascq == 3) {
> -                       break;          /* manual intervention required */
> +                   sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b || sshdr.ascq == 0x0c) ) {
> +                       break;  /* manual intervention required || Standby ||

This really doesn't look right.  ASC/ASCQ 0x04/0x0b is LUN not accessible;
target *port* in standby state. That's supposed to be because it was put
into a standby state according to SPC3(r23) 5.8.2.4.4.

I don't see how a port (target) is expected to come out of standby with
a LUN command.  The standard implies you need to do it with a set target
port groups command.  What array is actually giving this?

James

* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 15:36 ` James Bottomley
@ 2009-02-20 15:52   ` Matthew Wilcox
  2009-02-20 16:03     ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 15:52 UTC (permalink / raw)
  To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 03:36:22PM +0000, James Bottomley wrote:
> > +                   sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b || sshdr.ascq == 0x0c) ) {
> > +                       break;  /* manual intervention required || Standby ||
> 
> This really doesn't look right ASC/ASCQ 0x04/0x0b is LUN not accessible;
> target *port* in standby state. That's supposed to be because it was put
> into a standby state according to SPC3(r23) 5.8.2.4.4
> 
> I don't see how a port (target) is expected to come out of standby with
> a LUN command.  The standard implies you need to do it with a set target
> port groups command.  What array is actually giving this?

The port isn't coming out of standby state.  We send it a TEST_UNIT_READY,
it replies with a 0x04/0x0b.  At that point, we currently decide to send
it a START_STOP and wait 100 seconds.  This is clearly a crappy decision
on our part, we should just bail.
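
To make the cost concrete, here is a minimal user-space sketch of that decision. It is not the sd.c code: `struct sense`, `mock_test_unit_ready()` and `spinup_wait_seconds()` are hypothetical names, and the model assumes one retry per second up to the 100 seconds mentioned above, against a path that always answers 0x04/0x0b.

```c
#include <stdbool.h>

#define NOT_READY 0x02	/* SCSI sense key value */

/* Hypothetical, simplified view of decoded sense data. */
struct sense {
	unsigned char key, asc, ascq;
};

/* Stand-in for a standby path: every TEST UNIT READY fails with
 * NOT READY, ASC/ASCQ 0x04/0x0b. */
struct sense mock_test_unit_ready(void)
{
	struct sense s = { NOT_READY, 0x04, 0x0b };
	return s;
}

/* Returns the simulated seconds spent retrying before giving up. */
int spinup_wait_seconds(bool bail_on_standby)
{
	const int limit = 100;	/* assumed timeout, taken from the thread */
	int waited = 0;

	while (waited < limit) {
		struct sense s = mock_test_unit_ready();

		if (s.key != NOT_READY)
			break;		/* device became ready */
		if (s.asc == 0x04 &&
		    (s.ascq == 0x03 ||
		     (bail_on_standby && (s.ascq == 0x0b || s.ascq == 0x0c))))
			break;		/* a spin-up attempt is pointless */
		/* the real loop would issue START STOP UNIT and sleep here */
		waited++;
	}
	return waited;
}
```

In this model `spinup_wait_seconds(false)` runs the full 100 simulated seconds, while `spinup_wait_seconds(true)` returns 0 because the loop bails on the first response.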

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 15:52   ` Matthew Wilcox
@ 2009-02-20 16:03     ` James Bottomley
  2009-02-20 16:13       ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 16:03 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 08:52 -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 03:36:22PM +0000, James Bottomley wrote:
> > > +                   sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b || sshdr.ascq == 0x0c) ) {
> > > +                       break;  /* manual intervention required || Standby ||
> > 
> > This really doesn't look right ASC/ASCQ 0x04/0x0b is LUN not accessible;
> > target *port* in standby state. That's supposed to be because it was put
> > into a standby state according to SPC3(r23) 5.8.2.4.4
> > 
> > I don't see how a port (target) is expected to come out of standby with
> > a LUN command.  The standard implies you need to do it with a set target
> > port groups command.  What array is actually giving this?
> 
> The port isn't coming out of standby state.  We send it a TEST_UNIT_READY,
> it replies with a 0x04/0x0b.  At that point, we currently decide to send
> it a START_STOP and wait 100 seconds.  This is clearly a crappy decision
> on our part, we should just bail.

So we should be bailing on manual intervention, TP standby and TP
unavailable?  It looks like TP asymmetric access transition is waitable.

It also looks like offline and notify (enable spinup) required are also
not worth waiting for ... although the latter is a SAS power management
state which it's not clear to me how to handle properly.

James



* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 16:03     ` James Bottomley
@ 2009-02-20 16:13       ` Matthew Wilcox
  2009-02-20 16:24         ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 16:13 UTC (permalink / raw)
  To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 10:03:15AM -0600, James Bottomley wrote:
> > The port isn't coming out of standby state.  We send it a TEST_UNIT_READY,
> > it replies with a 0x04/0x0b.  At that point, we currently decide to send
> > it a START_STOP and wait 100 seconds.  This is clearly a crappy decision
> > on our part, we should just bail.
> 
> So we should be bailing on manual intervention, TP standby and TP
> unavailable?  It looks like TP asymmetric access transition is waitable.

I think that's correct (and I think my version of this patch makes that
clearer).

SPC 4r14 isn't clear on 'Asymmetric Access Transition' -- I can't tell
whether that state is entered on transition *to* active, or *from*
active, or both.

> It also looks like offline and notify (enable spinup) required are also
> not worth waiting for ... although the latter is a SAS power management
> state which it's not clear to me how to handle properly.

Offline is only applicable to M and V (Media Changer and Automation)
devices, neither of which should be attached to by sd.

I don't know what 'Enable Spinup' is for -- maybe Doug knows?  Sending a
START_STOP to the device might be exactly what they intend for us to do.
Under a 'First, Do No Harm' theory, perhaps we should leave well enough
alone and just add Standby and Unavailable?
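
The position reached so far can be summarised as a small classification, sketched here in standalone C. The enum and function names are hypothetical and this is not the patch itself; the ASCQ meanings are the ones named in this thread, and "retry" simply stands for leaving the existing spin-up-and-wait path untouched.

```c
#include <stdint.h>

/* Hypothetical action names, for illustration only. */
enum nr_action {
	NR_BAIL,	/* stop immediately, no spin-up attempt */
	NR_RETRY	/* keep the existing START STOP UNIT / wait loop */
};

/* Classify a NOT READY response by its ASC/ASCQ pair. */
enum nr_action classify_not_ready(uint8_t asc, uint8_t ascq)
{
	if (asc != 0x04)
		return NR_RETRY;

	switch (ascq) {
	case 0x03:	/* manual intervention required */
	case 0x0b:	/* target port in standby state */
	case 0x0c:	/* target port in unavailable state */
		return NR_BAIL;
	case 0x0a:	/* asymmetric access state transition: worth waiting */
	case 0x11:	/* notify (enable spinup) required: left alone for now */
	default:
		return NR_RETRY;
	}
}
```

Only the two target port states are added to the existing manual-intervention case; everything else keeps today's behaviour, in line with the "First, Do No Harm" suggestion.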

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 16:13       ` Matthew Wilcox
@ 2009-02-20 16:24         ` James Bottomley
  2009-02-20 17:04           ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 16:24 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 09:13 -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 10:03:15AM -0600, James Bottomley wrote:
> > > The port isn't coming out of standby state.  We send it a TEST_UNIT_READY,
> > > it replies with a 0x04/0x0b.  At that point, we currently decide to send
> > > it a START_STOP and wait 100 seconds.  This is clearly a crappy decision
> > > on our part, we should just bail.
> > 
> > So we should be bailing on manual intervention, TP standby and TP
> > unavailable?  It looks like TP asymmetric access transition is waitable.
> 
> I think that's correct (and I think my version of this patch makes that
> clearer).
> 
> SPC 4r14 isn't clear on 'Asymmetric Access Transition' -- I can't tell
> whether that state is entered on transition *to* active, or *from*
> active, or both.

I think the point is that it's a transition:  Once it comes out of it we
either get access or we don't, so it's worth waiting to see what
happens.

> > It also looks like offline and notify (enable spinup) required are also
> > not worth waiting for ... although the latter is a SAS power management
> > state which it's not clear to me how to handle properly.
> 
> Offline is only applicable to M and V (Media Changer and Automation)
> devices, neither of which should be attached to by sd.

Makes sense

> I don't know what 'Enable Spinup' is for -- maybe Doug knows?  Sending a
> START_STOP to the device might be exactly what they intend for us to do.
> Under a 'First, Do No Harm' theory, perhaps we should leave well enough
> alone and just add Standby and Unavailable?

As I said, it's a SAS power management command related condition:  The
drive is limited to consuming a certain level of power and that's not
enough to spin up, so it won't spin up regardless of how many start unit
commands it gets sent until the power management control is changed to
allow it to consume enough power for the spinup.  I think it's ignorable
for now ... it probably means that when power management is added we
need to get the transport classes involved to send the appropriate sas
pm command.

James



* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 16:24         ` James Bottomley
@ 2009-02-20 17:04           ` Matthew Wilcox
  2009-02-23 11:48             ` Rengarajan, Narayanan (STSD)
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 17:04 UTC (permalink / raw)
  To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 04:24:08PM +0000, James Bottomley wrote:
> I think the point is that it's a transition:  Once it comes out of it we
> either get access or we don't, so it's worth waiting to see what
> happens.

I agree.  If we knew that it could only be transitioning to inactive, we
could skip it.  This state probably only gets returned once in a blue
moon anyway.

> > I don't know what 'Enable Spinup' is for -- maybe Doug knows?  Sending a
> > START_STOP to the device might be exactly what they intend for us to do.
> > Under a 'First, Do No Harm' theory, perhaps we should leave well enough
> > alone and just add Standby and Unavailable?
> 
> As I said, it's a SAS power management command related condition:  The
> drive is limited to consuming a certain level of power and that's not
> enough to spin up, so it won't spin up regardless of how many start unit
> commands it gets sent until the power management control is changed to
> allow it to consume enough power for the spinup.  I think it's ignorable
> for now ... it probably means that when power management is added we
> need to get the transport classes involved to send the appropriate sas
> pm command.

Looking at SAS2r14, I see that NOTIFY (ENABLE SPINUP) is a primitive,
not a command.  If the SAS device is attached through an expander, I
don't think we have a way to send that primitive to the device.  We must
wait for the expander to send it.  If it's directly-connected, the
initiator port is supposed to send it.  Presumably this is handled
either by firmware on the HBA or by the HBA driver; either way, we don't
seem to have a way today to get the HBA to send this primitive.

I think our current behaviour is correct for this command, so the patch
here: http://marc.info/?l=linux-scsi&m=123513805527153&w=2 is correct.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

* RE: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-20 17:04           ` Matthew Wilcox
@ 2009-02-23 11:48             ` Rengarajan, Narayanan (STSD)
  2009-02-23 14:52               ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Rengarajan, Narayanan (STSD) @ 2009-02-23 11:48 UTC (permalink / raw)
  To: Matthew Wilcox, James Bottomley; +Cc: linux-scsi@vger.kernel.org

If this patch is valid, when can we expect it in the mainline kernel? I can help test the patch once it is included in the kernel.

Narayanan
  

-----Original Message-----
From: Matthew Wilcox [mailto:matthew@wil.cx] 
Sent: Friday, February 20, 2009 10:35 PM
To: James Bottomley
Cc: Rengarajan, Narayanan (STSD); linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.

On Fri, Feb 20, 2009 at 04:24:08PM +0000, James Bottomley wrote:
> I think the point is that it's a transition:  Once it comes out of it 
> we either get access or we don't, so it's worth waiting to see what 
> happens.

I agree.  If we knew that it could only be transitioning to inactive, we could skip it.  This state probably only gets returned once in a blue moon anyway.

> > I don't know what 'Enable Spinup' is for -- maybe Doug knows?  
> > Sending a START_STOP to the device might be exactly what they intend for us to do.
> > Under a 'First, Do No Harm' theory, perhaps we should leave well 
> > enough alone and just add Standby and Unavailable?
> 
> As I said, it's a SAS power management command related condition:  The 
> drive is limited to consuming a certain level of power and that's not 
> enough to spin up, so it won't spin up regardless of how many start 
> unit commands it gets sent until the power management control is 
> changed to allow it to consume enough power for the spinup.  I think 
> it's ignorable for now ... it probably means that when power 
> management is added we need to get the transport classes involved to 
> send the appropriate sas pm command.

Looking at SAS2r14, I see that NOTIFY (ENABLE SPINUP) is a primitive, not a command.  If the SAS device is attached through an expander, I don't think we have a way to send that primitive to the device.  We must wait for the expander to send it.  If it's directly-connected, the initiator port is supposed to send it.  Presumably this is handled either by firmware on the HBA or by the HBA driver; either way, we don't seem to have a way today to get the HBA to send this primitive.

I think our current behaviour is correct for this command, so the patch
here: http://marc.info/?l=linux-scsi&m=123513805527153&w=2 is correct.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours.  We can't possibly take such a retrograde step."

* RE: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-23 11:48             ` Rengarajan, Narayanan (STSD)
@ 2009-02-23 14:52               ` James Bottomley
  2009-02-28 21:26                 ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-23 14:52 UTC (permalink / raw)
  To: Rengarajan, Narayanan (STSD); +Cc: Matthew Wilcox, linux-scsi@vger.kernel.org

On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> If this patch is valid, when can we expect this on mainstream
> kernel? I can help testing this patch when included in the kernel.

If you could test it and report that it works, probably fairly
immediately.

James



* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-23 14:52               ` James Bottomley
@ 2009-02-28 21:26                 ` Matthew Wilcox
  2009-02-28 23:56                   ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-28 21:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Mon, Feb 23, 2009 at 08:52:28AM -0600, James Bottomley wrote:
> On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> > If this patch is valid, when can we expect this on mainstream
> > kernel? I can help testing this patch when included in the kernel.
> 
> If you could test it and report that it works, probably fairly
> immediately.

Narayanan reported that the patch I sent worked, but it wasn't in the push
to Linus you just sent.  I'm not quite sure whether 'fairly immediately'
meant 'for 2.6.29' or 'for 2.6.30', so I don't know whether this was an
oversight or an intentional omission.  In case it was the former, I'm
alerting you to it ;-)

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
  2009-02-28 21:26                 ` Matthew Wilcox
@ 2009-02-28 23:56                   ` James Bottomley
  0 siblings, 0 replies; 11+ messages in thread
From: James Bottomley @ 2009-02-28 23:56 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Sat, 2009-02-28 at 14:26 -0700, Matthew Wilcox wrote:
> On Mon, Feb 23, 2009 at 08:52:28AM -0600, James Bottomley wrote:
> > On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> > > If this patch is valid, when can we expect this on mainstream
> > > kernel? I can help testing this patch when included in the kernel.
> > 
> > If you could test it and report that it works, probably fairly
> > immediately.
> 
> Narayanan reported that the patch I sent worked, but it wasn't in the push
> to Linus you just sent.  I'm not quite sure whether 'fairly immediately'
> meant 'for 2.6.29' or 'for 2.6.30', so I don't know whether this was an
> oversight or an intentional omission.  In case it was the former, I'm
> alerting you to it ;-)

It was intentional ... I've been trying to incubate rc-fixes in the next
tree for a few days (which means I can't add to the rc-fixes tree after
about Monday if I want to send it to Linus on Friday/Saturday), so it
will be in the next push.

James


