* [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
@ 2009-02-20 11:14 Rengarajan, Narayanan (STSD)
2009-02-20 15:36 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Rengarajan, Narayanan (STSD) @ 2009-02-20 11:14 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org
Hi,
Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time in 2.6.27 kernel.
Steps to reproduce:
1. present a standby lun to the host
2. do a discovery from the host (scan the scsi bus)
3. Spinning of disks is observed in /var/log/messages
Whenever a device goes offline and comes back, the new sd device takes longer
to get created. This is because the sd_spinup_disk function keeps trying to
spin up the disk: the standby paths return device not ready with
0x04/0x0b asc/ascq.
Recommended patch :
diff -pNaur /usr/src/linux/drivers/scsi/sd.c sd.c
--- /usr/src/linux/drivers/scsi/sd.c 2009-02-09 22:24:56.000000000 +0530
+++ sd.c 2009-02-19 16:39:16.000000000 +0530
@@ -1181,8 +1181,8 @@ sd_spinup_disk(struct scsi_disk *sdkp)
*/
if (sense_valid &&
sshdr.sense_key == NOT_READY &&
- sshdr.asc == 4 && sshdr.ascq == 3) {
- break; /* manual intervention required */
+ sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b || sshdr.ascq == 0x0c)) {
+ break; /* manual intervention required || Standby || Unavailable */
/*
* Issue command to spin up drive when not ready
Thanks,
Narayanan
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 11:14 [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time Rengarajan, Narayanan (STSD)
@ 2009-02-20 15:36 ` James Bottomley
2009-02-20 15:52 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 15:36 UTC (permalink / raw)
To: Rengarajan, Narayanan (STSD); +Cc: linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 11:14 +0000, Rengarajan, Narayanan (STSD) wrote:
> Hi,
>
> Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time in 2.6.27 kernel.
>
> Steps to reproduce:
> 1. present a standby lun to the host
> 2. do a discovery from the host (scan the scsi bus)
> 3. Spinning of disks is observed in /var/log/messages
>
> Whenever a device goes offline and comes back, the new sd device takes longer
> time to get created. This is because of the spinning up of disk in
> sd_spinup_disk function as the standby paths would return device not ready with
> 0x04/0x0b asc/ascq.
>
> Recommended patch :
>
> diff -pNaur /usr/src/linux/drivers/scsi/sd.c sd.c
> --- /usr/src/linux/drivers/scsi/sd.c 2009-02-09 22:24:56.000000000 +0530
> +++ sd.c 2009-02-19 16:39:16.000000000 +0530
> @@ -1181,8 +1181,8 @@ sd_spinup_disk(struct scsi_disk *sdkp)
> */
> if (sense_valid &&
> sshdr.sense_key == NOT_READY &&
> - sshdr.asc == 4 && sshdr.ascq == 3) {
> - break; /* manual intervention required */
> + sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b ||
> sshdr.ascq == 0x0c) ) {
> + break; /* manual intervention required || Standby ||

This really doesn't look right. ASC/ASCQ 0x04/0x0b is LUN not accessible;
target *port* in standby state. That's supposed to be because it was put
into a standby state according to SPC3(r23) 5.8.2.4.4.

I don't see how a port (target) is expected to come out of standby with
a LUN command. The standard implies you need to do it with a set target
port groups command. What array is actually giving this?

James
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 15:36 ` James Bottomley
@ 2009-02-20 15:52 ` Matthew Wilcox
2009-02-20 16:03 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 15:52 UTC (permalink / raw)
To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 03:36:22PM +0000, James Bottomley wrote:
> > + sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b ||
> > sshdr.ascq == 0x0c) ) {
> > + break; /* manual intervention required || Standby ||
>
> This really doesn't look right ASC/ASCQ 0x04/0x0b is LUN not accessible;
> target *port* in standby state. That's supposed to be because it was put
> into a standby state according to SPC3(r23) 5.8.2.4.4
>
> I don't see how a port (target) is expected to come out of standby with
> a LUN command. The standard implies you need to do it with a set target
> port groups command. What array is actually giving this?

The port isn't coming out of standby state. We send it a TEST_UNIT_READY,
it replies with a 0x04/0x0b. At that point, we currently decide to send
it a START_STOP and wait 100 seconds. This is clearly a crappy decision
on our part, we should just bail.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
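
The cost Matthew describes can be illustrated with a small stand-alone model. This is only a sketch and not kernel code: `struct sense_model`, `standby_path_tur()` and `simulate_spinup_wait()` are hypothetical names, the one-second poll interval is an assumption, and the device model simply answers every TEST UNIT READY with NOT READY, 0x04/0x0b, as reported in the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NOT_READY 0x02   /* SCSI sense key NOT READY */

/* Simplified stand-in for decoded sense data; only the three fields
 * that the check in the thread looks at are modelled. */
struct sense_model {
    bool valid;
    unsigned char key;    /* sense key */
    unsigned char asc;    /* additional sense code */
    unsigned char ascq;   /* additional sense code qualifier */
};

/* Hypothetical device model: a standby path answers every
 * TEST UNIT READY with NOT READY, 0x04/0x0b, however long we wait. */
struct sense_model standby_path_tur(void)
{
    struct sense_model s = { true, SK_NOT_READY, 0x04, 0x0b };
    return s;
}

/* Returns the number of simulated seconds spent polling before giving up.
 * bail_on_standby selects the behaviour proposed in the thread;
 * timeout_secs models the spin-up timeout (100 in the discussion). */
int simulate_spinup_wait(bool bail_on_standby, int timeout_secs)
{
    int waited = 0;

    assert(timeout_secs >= 0);

    while (waited < timeout_secs) {
        struct sense_model s = standby_path_tur();

        if (!s.valid || s.key != SK_NOT_READY)
            break;      /* ready, or an error this loop does not handle */

        if (s.asc == 0x04 && s.ascq == 0x03)
            break;      /* manual intervention required */

        if (bail_on_standby && s.asc == 0x04 &&
            (s.ascq == 0x0b || s.ascq == 0x0c))
            break;      /* target port standby or unavailable */

        /* Otherwise the host would issue START STOP UNIT and poll
         * again; one loop pass stands for one second of waiting. */
        waited++;
    }
    return waited;
}
```

In this model, calling `simulate_spinup_wait(false, 100)` runs the loop to the full timeout, while `simulate_spinup_wait(true, 100)` returns on the first pass, which is the difference in path restoration time the original report is about.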
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 15:52 ` Matthew Wilcox
@ 2009-02-20 16:03 ` James Bottomley
2009-02-20 16:13 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 16:03 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 08:52 -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 03:36:22PM +0000, James Bottomley wrote:
> > > + sshdr.asc == 4 && (sshdr.ascq == 3 || sshdr.ascq == 0x0b ||
> > > sshdr.ascq == 0x0c) ) {
> > > + break; /* manual intervention required || Standby ||
> >
> > This really doesn't look right ASC/ASCQ 0x04/0x0b is LUN not accessible;
> > target *port* in standby state. That's supposed to be because it was put
> > into a standby state according to SPC3(r23) 5.8.2.4.4
> >
> > I don't see how a port (target) is expected to come out of standby with
> > a LUN command. The standard implies you need to do it with a set target
> > port groups command. What array is actually giving this?
>
> The port isn't coming out of standby state. We send it a TEST_UNIT_READY,
> it replies with a 0x04/0x0b. At that point, we currently decide to send
> it a START_STOP and wait 100 seconds. This is clearly a crappy decision
> on our part, we should just bail.

So we should be bailing on manual intervention, TP standby and TP
unavailable? It looks like TP asymmetric access transition is waitable.

It also looks like offline and notify (enable spinup) required are also
not worth waiting for ... although the latter is a SAS power management
state which it's not clear to me how to handle properly.

James
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 16:03 ` James Bottomley
@ 2009-02-20 16:13 ` Matthew Wilcox
2009-02-20 16:24 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 16:13 UTC (permalink / raw)
To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 10:03:15AM -0600, James Bottomley wrote:
> > The port isn't coming out of standby state. We send it a TEST_UNIT_READY,
> > it replies with a 0x04/0x0b. At that point, we currently decide to send
> > it a START_STOP and wait 100 seconds. This is clearly a crappy decision
> > on our part, we should just bail.
>
> So we should be bailing on manual intervention, TP standby and TP
> unavailable? It looks like TP asymmetric access transition is waitable.

I think that's correct (and I think my version of this patch makes that
clearer).

SPC 4r14 isn't clear on 'Asymmetric Access Transition' -- I can't tell
whether that state is entered on transition *to* active, or *from*
active, or both.

> It also looks like offline and notify (enable spinup) required are also
> not worth waiting for ... although the latter is a SAS power management
> state which it's not clear to me how to handle properly.

Offline is only applicable to M and V (Media Changer and Automation)
devices, neither of which should be attached to by sd.

I don't know what 'Enable Spinup' is for -- maybe Doug knows? Sending a
START_STOP to the device might be exactly what they intend for us to do.
Under a 'First, Do No Harm' theory, perhaps we should leave well enough
alone and just add Standby and Unavailable?

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 16:13 ` Matthew Wilcox
@ 2009-02-20 16:24 ` James Bottomley
2009-02-20 17:04 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-20 16:24 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, 2009-02-20 at 09:13 -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 10:03:15AM -0600, James Bottomley wrote:
> > > The port isn't coming out of standby state. We send it a TEST_UNIT_READY,
> > > it replies with a 0x04/0x0b. At that point, we currently decide to send
> > > it a START_STOP and wait 100 seconds. This is clearly a crappy decision
> > > on our part, we should just bail.
> >
> > So we should be bailing on manual intervention, TP standby and TP
> > unavailable? It looks like TP asymmetric access transition is waitable.
>
> I think that's correct (and I think my version of this patch makes that
> clearer).
>
> SPC 4r14 isn't clear on 'Asymmetric Access Transition' -- I can't tell
> whether that state is entered on transition *to* active, or *from*
> active, or both.

I think the point is that it's a transition: Once it comes out of it we
either get access or we don't, so it's worth waiting to see what
happens.

> > It also looks like offline and notify (enable spinup) required are also
> > not worth waiting for ... although the latter is a SAS power management
> > state which it's not clear to me how to handle properly.
>
> Offline is only applicable to M and V (Media Changer and Automation)
> devices, neither of which should be attached to by sd.

Makes sense.

> I don't know what 'Enable Spinup' is for -- maybe Doug knows? Sending a
> START_STOP to the device might be exactly what they intend for us to do.
> Under a 'First, Do No Harm' theory, perhaps we should leave well enough
> alone and just add Standby and Unavailable?

As I said, it's a SAS power management command related condition: The
drive is limited to consuming a certain level of power and that's not
enough to spin up, so it won't spin up regardless of how many start unit
commands it gets sent until the power management control is changed to
allow it to consume enough power for the spinup. I think it's ignorable
for now ... it probably means that when power management is added we
need to get the transport classes involved to send the appropriate sas
pm command.

James
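
The outcome of the exchange so far can be summarised as a lookup table. The sketch below is an illustration only: the `nr_*` names are hypothetical, the table covers sense key NOT READY with ASC 0x04, and the ASCQ values 0x0a, 0x11 and 0x12 are not quoted anywhere in the thread (they are taken from the SPC ASC/ASCQ assignments and should be checked against the standard). `NR_KEEP` means "leave the existing spin-up handling unchanged", which is what the thread settles on for the transition and enable-spinup cases, and for offline because sd should not see it.

```c
#include <assert.h>
#include <stddef.h>

enum nr_action {
    NR_BAIL,    /* stop at once: waiting or START STOP UNIT cannot help */
    NR_KEEP     /* leave the existing spin-up handling unchanged */
};

struct nr_entry {
    unsigned char ascq;     /* qualifier for NOT READY, ASC 0x04 */
    const char *meaning;
    enum nr_action action;
};

/* ASCQ 0x03, 0x0b and 0x0c come from the patch under discussion; the
 * remaining values are assumptions from the SPC code assignments. */
static const struct nr_entry nr_table[] = {
    { 0x03, "manual intervention required",       NR_BAIL },
    { 0x0a, "asymmetric access state transition", NR_KEEP },
    { 0x0b, "target port in standby state",       NR_BAIL },
    { 0x0c, "target port in unavailable state",   NR_BAIL },
    { 0x11, "notify (enable spinup) required",    NR_KEEP },
    { 0x12, "offline",                            NR_KEEP },
};

#define NR_TABLE_LEN (sizeof(nr_table) / sizeof(nr_table[0]))

/* Sanity check on the table itself: ASCQ values must be strictly
 * increasing, so each code appears at most once. */
void nr_table_check(void)
{
    for (size_t i = 1; i < NR_TABLE_LEN; i++)
        assert(nr_table[i - 1].ascq < nr_table[i].ascq);
}

/* Returns the description for a listed ASCQ, or NULL if it is unlisted. */
const char *nr_describe(unsigned char ascq)
{
    for (size_t i = 0; i < NR_TABLE_LEN; i++)
        if (nr_table[i].ascq == ascq)
            return nr_table[i].meaning;
    return NULL;
}

/* Unlisted qualifiers fall back to the existing behaviour. */
enum nr_action nr_classify(unsigned char ascq)
{
    for (size_t i = 0; i < NR_TABLE_LEN; i++)
        if (nr_table[i].ascq == ascq)
            return nr_table[i].action;
    return NR_KEEP;
}
```

Defaulting unlisted qualifiers to `NR_KEEP` follows the 'First, Do No Harm' suggestion above: only the states that are known not to resolve through waiting are short-circuited.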
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 16:24 ` James Bottomley
@ 2009-02-20 17:04 ` Matthew Wilcox
2009-02-23 11:48 ` Rengarajan, Narayanan (STSD)
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-20 17:04 UTC (permalink / raw)
To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Fri, Feb 20, 2009 at 04:24:08PM +0000, James Bottomley wrote:
> I think the point is that it's a transition: Once it comes out of it we
> either get access or we don't, so it's worth waiting to see what
> happens.

I agree. If we knew that it could only be transitioning to inactive, we
could skip it. This state probably only gets returned once in a blue
moon anyway.

> > I don't know what 'Enable Spinup' is for -- maybe Doug knows? Sending a
> > START_STOP to the device might be exactly what they intend for us to do.
> > Under a 'First, Do No Harm' theory, perhaps we should leave well enough
> > alone and just add Standby and Unavailable?
>
> As I said, it's a SAS power management command related condition: The
> drive is limited to consuming a certain level of power and that's not
> enough to spin up, so it won't spin up regardless of how many start unit
> commands it gets sent until the power management control is changed to
> allow it to consume enough power for the spinup. I think it's ignorable
> for now ... it probably means that when power management is added we
> need to get the transport classes involved to send the appropriate sas
> pm command.

Looking at SAS2r14, I see that NOTIFY (ENABLE SPINUP) is a primitive,
not a command. If the SAS device is attached through an expander, I
don't think we have a way to send that primitive to the device. We must
wait for the expander to send it. If it's directly connected, the
initiator port is supposed to send it. Presumably this is handled either
by firmware on the HBA or by the HBA driver; either way, we don't seem
to have a way today to get the HBA to send this primitive.

I think our current behaviour is correct for this command, so the patch
here: http://marc.info/?l=linux-scsi&m=123513805527153&w=2 is correct.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* RE: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-20 17:04 ` Matthew Wilcox
@ 2009-02-23 11:48 ` Rengarajan, Narayanan (STSD)
2009-02-23 14:52 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Rengarajan, Narayanan (STSD) @ 2009-02-23 11:48 UTC (permalink / raw)
To: Matthew Wilcox, James Bottomley; +Cc: linux-scsi@vger.kernel.org

If this patch is valid, when can we expect it in the mainstream kernel?
I can help test this patch when it is included in the kernel.

Narayanan

-----Original Message-----
From: Matthew Wilcox [mailto:matthew@wil.cx]
Sent: Friday, February 20, 2009 10:35 PM
To: James Bottomley
Cc: Rengarajan, Narayanan (STSD); linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.

On Fri, Feb 20, 2009 at 04:24:08PM +0000, James Bottomley wrote:
> I think the point is that it's a transition: Once it comes out of it
> we either get access or we don't, so it's worth waiting to see what
> happens.

I agree. If we knew that it could only be transitioning to inactive, we
could skip it. This state probably only gets returned once in a blue
moon anyway.

> > I don't know what 'Enable Spinup' is for -- maybe Doug knows?
> > Sending a START_STOP to the device might be exactly what they intend for us to do.
> > Under a 'First, Do No Harm' theory, perhaps we should leave well
> > enough alone and just add Standby and Unavailable?
>
> As I said, it's a SAS power management command related condition: The
> drive is limited to consuming a certain level of power and that's not
> enough to spin up, so it won't spin up regardless of how many start
> unit commands it gets sent until the power management control is
> changed to allow it to consume enough power for the spinup. I think
> it's ignorable for now ... it probably means that when power
> management is added we need to get the transport classes involved to
> send the appropriate sas pm command.

Looking at SAS2r14, I see that NOTIFY (ENABLE SPINUP) is a primitive,
not a command. If the SAS device is attached through an expander, I
don't think we have a way to send that primitive to the device. We must
wait for the expander to send it. If it's directly connected, the
initiator port is supposed to send it. Presumably this is handled either
by firmware on the HBA or by the HBA driver; either way, we don't seem
to have a way today to get the HBA to send this primitive.

I think our current behaviour is correct for this command, so the patch
here: http://marc.info/?l=linux-scsi&m=123513805527153&w=2 is correct.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* RE: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-23 11:48 ` Rengarajan, Narayanan (STSD)
@ 2009-02-23 14:52 ` James Bottomley
2009-02-28 21:26 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2009-02-23 14:52 UTC (permalink / raw)
To: Rengarajan, Narayanan (STSD); +Cc: Matthew Wilcox, linux-scsi@vger.kernel.org

On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> If this patch is valid, when can we expect this on mainstream
> kernel. I can help testing this patch when included in the kernel.

If you could test it and report that it works, probably fairly
immediately.

James
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-23 14:52 ` James Bottomley
@ 2009-02-28 21:26 ` Matthew Wilcox
2009-02-28 23:56 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2009-02-28 21:26 UTC (permalink / raw)
To: James Bottomley; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Mon, Feb 23, 2009 at 08:52:28AM -0600, James Bottomley wrote:
> On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> > If this patch is valid, when can we expect this on mainstream
> > kernel. I can help testing this patch when included in the kernel.
>
> If you could test it and report that it works, probably fairly
> immediately.

Narayanan reported that the patch I sent worked, but it wasn't in the push
to Linus you just sent. I'm not quite sure whether 'fairly immediately'
meant 'for 2.6.29' or 'for 2.6.30', so I don't know whether this was an
oversight or an intentional omission. In case it was the former, I'm
alerting you to it ;-)

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time.
2009-02-28 21:26 ` Matthew Wilcox
@ 2009-02-28 23:56 ` James Bottomley
0 siblings, 0 replies; 11+ messages in thread
From: James Bottomley @ 2009-02-28 23:56 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Rengarajan, Narayanan (STSD), linux-scsi@vger.kernel.org

On Sat, 2009-02-28 at 14:26 -0700, Matthew Wilcox wrote:
> On Mon, Feb 23, 2009 at 08:52:28AM -0600, James Bottomley wrote:
> > On Mon, 2009-02-23 at 11:48 +0000, Rengarajan, Narayanan (STSD) wrote:
> > > If this patch is valid, when can we expect this on mainstream
> > > kernel. I can help testing this patch when included in the kernel.
> >
> > If you could test it and report that it works, probably fairly
> > immediately.
>
> Narayanan reported that the patch I sent worked, but it wasn't in the push
> to Linus you just sent. I'm not quite sure whether 'fairly immediately'
> meant 'for 2.6.29' or 'for 2.6.30', so I don't know whether this was an
> oversight or an intentional omission. In case it was the former, I'm
> alerting you to it ;-)

It was intentional ... I've been trying to incubate rc-fixes in the next
tree for a few days (which means I can't add to the rc-fixes tree after
about Monday if I want to send it to Linus on Friday/Saturday), so it
will be in the next push.

James
end of thread, other threads:[~2009-02-28 23:56 UTC | newest]
Thread overview: 11+ messages
2009-02-20 11:14 [PATCH 1/1] : Spinning up disk is observed on standby paths until timeout, resulting in longer path restoration time Rengarajan, Narayanan (STSD)
2009-02-20 15:36 ` James Bottomley
2009-02-20 15:52 ` Matthew Wilcox
2009-02-20 16:03 ` James Bottomley
2009-02-20 16:13 ` Matthew Wilcox
2009-02-20 16:24 ` James Bottomley
2009-02-20 17:04 ` Matthew Wilcox
2009-02-23 11:48 ` Rengarajan, Narayanan (STSD)
2009-02-23 14:52 ` James Bottomley
2009-02-28 21:26 ` Matthew Wilcox
2009-02-28 23:56 ` James Bottomley