Re: disk restart failure after suspend

public inbox for linux-scsi@vger.kernel.org

* Re: disk restart failure after suspend
       [not found]                 ` <20091016060320.GB30389@mac.home>
@ 2009-10-18 14:42                   ` Stefan Richter
  2009-10-19 13:42                     ` Alan Stern
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Richter @ 2009-10-18 14:42 UTC (permalink / raw)
  To: Tino Keitel; +Cc: linux1394-user, linux-scsi, Tejun Heo, Alan Stern

[Repost with corrected CCs, sorry for the mess.
Problem:  FireWire disk becomes inaccessible during resume because START
STOP UNIT failed.  http://marc.info/?t=125481515600002]

On 2009-10-16, Tino Keitel wrote at linux1394-user:
> On Sun, Oct 11, 2009 at 23:55:03 +0200, Stefan Richter wrote:
>> Tino Keitel wrote:
>>> I got another failure with the 0x20 workaround enabled. I
>>> suppose that it is a hardware issue. :-(
>> If you have access to a Windows PC, check whether there is a firmware
>> update for this disk.
>>
>> Besides, maybe the SCSI stack gives up too quickly if a command in the
>> resume path fails.  Just a guess; I never dealt with that kind of kernel
>> code myself.  I'll try to look it up when I have some time to kill...
> 
> This brought me to an idea: I just added a retry loop around the
> command to start the disk. This morning, it became effective for the
> first time:
> 
> sd 4:0:0:0: [sdb] Starting disk
> sd 4:0:0:0: [sdb] START_STOP FAILED, retrying.
> sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
> firewire_core: rediscovered device fw1
> sd 4:0:0:0: [sdb] START_STOP FAILED, retrying.
> firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)
> usb 4-1: reset full speed USB device using uhci_hcd and address 2
> usb 5-2: reset full speed USB device using uhci_hcd and address 3
> usb 5-1: reset full speed USB device using uhci_hcd and address 8
> Restarting tasks ... done.
> 
> So it actually worked.
> 
> The retries are sent at a 2-second interval.
> 
> It seems to me that the restart always fails if the "rediscovered
> device fw1" or "firewire_sbp2: fw1.0: reconnected to LUN 0000"
> message comes after the "[sdb] Starting disk" message.  That would
> sound like an actual bug to me.

It is not a bug.  IEEE 1394 rediscovery and SBP-2 reconnect can become
necessary anytime (and they do become necessary at /least/ once during
PM resume), in no particular order with respect to SCSI request
submission.  Our drivers (firewire-sbp2 mainly) need to be able to
handle any order of such events.

> I just checked my kernel logs and saw exactly that: at every failed
> resume, the "Starting disk" message came before the "rediscovered
> device fw1" message. I guess that there is no need to throw away the
> enclosure anymore. :-)
> 
> Regards,
> Tino

Interesting findings.

There are two independent places of the code that could possibly be
improved to fix this issue:

1.)  sd's PM resume method:

1.a)  sd_resume could gain this retry loop which you implemented.

1.b)  sd_resume (but probably not sd_suspend) could optimistically
ignore any error return from sd_start_stop_device.  If the motor cannot
be started immediately at resume, the SCSI core would try to start it
later on when the disk is normally accessed.

My assumption here is that an error return from sd_resume causes the
disk to become inaccessible (taken offline?).

2.)  firewire-sbp2's bus reset handling scheme (the reconnect thing):

The originally submitted incarnation of firewire-sbp2 had very weak bus
reset handling which lost contact with disks very easily.  I then ported
over drivers/ieee1394/sbp2.c's bus reset handling to drivers/firewire
although I was not satisfied with that implementation anymore either.
This scheme uses the SCSI core's host block/unblock API to prevent
queuing of new commands from the moment firewire-sbp2 detects that a
reconnect is necessary until the reconnect has succeeded.

After reconnect, already pending requests (at most one request at the
moment because we currently don't support queue depth > 1 in
firewire-sbp2) will be aborted and the SCSI request completed with
DID_BUS_BUSY.  And this is what apparently happens in your case:
sd_resume issues START STOP UNIT, sbp2 reconnects at the earliest
opportunity but alas after sd's request went out, the request is
completed with busy status, and sd_resume returns an error.

Instead of this, firewire-sbp2 should rather keep requests which are
present at reconnect and submit them once more.  Whether this is
actually feasible I don't know yet, but I have hopes.  If this is
possible, we can also rip out all usages of the Scsi_Host block/unblock
API in firewire-sbp2, which is a very delicate API with a high danger of
deadlocks.
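
The difference between the two policies can be sketched in userspace C.
This is a toy model under stated assumptions, not firewire-sbp2 code:
every name in it (sim_request, sim_on_reconnect, the policy enum) is
hypothetical.  The only things taken from the text above are that the
current scheme completes the pending request with a busy status and that
the proposed scheme would submit it once more.

```c
#include <assert.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum sim_result { SIM_PENDING, SIM_OK, SIM_BUS_BUSY };
enum sim_policy { POLICY_ABORT_BUSY, POLICY_RESUBMIT };

struct sim_request {
	enum sim_result result;	/* what the upper layer eventually sees */
	int submissions;	/* how often the request went to the target */
};

/* Send the request to the target; it stays pending until completed. */
static void sim_submit(struct sim_request *req)
{
	req->submissions++;
	req->result = SIM_PENDING;
}

/* Called when a reconnect finishes while a request is still pending. */
static void sim_on_reconnect(struct sim_request *req, enum sim_policy policy)
{
	assert(req->result == SIM_PENDING);
	if (policy == POLICY_ABORT_BUSY)
		req->result = SIM_BUS_BUSY;	/* failure reaches the upper layer */
	else
		sim_submit(req);		/* retry stays hidden in the transport */
}

/* The target answers whatever is still pending. */
static void sim_target_complete(struct sim_request *req)
{
	if (req->result == SIM_PENDING)
		req->result = SIM_OK;
}

/* One resume cycle: the request goes out, then the reconnect arrives. */
static enum sim_result sim_resume(enum sim_policy policy, int *submissions)
{
	struct sim_request req = { SIM_PENDING, 0 };

	sim_submit(&req);
	sim_on_reconnect(&req, policy);
	sim_target_complete(&req);
	*submissions = req.submissions;
	return req.result;
}
```

With POLICY_ABORT_BUSY the caller sees SIM_BUS_BUSY after one submission;
with POLICY_RESUBMIT it sees SIM_OK after two, and never learns that a
reconnect happened in between.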
-- 
Stefan Richter
-=====-==--= =-=- =--=-
http://arcgraph.de/sr/


* Re: disk restart failure after suspend
  2009-10-18 14:42                   ` disk restart failure after suspend Stefan Richter
@ 2009-10-19 13:42                     ` Alan Stern
  2009-10-19 18:05                       ` Tino Keitel
  2009-10-19 20:24                       ` Stefan Richter
  0 siblings, 2 replies; 6+ messages in thread
From: Alan Stern @ 2009-10-19 13:42 UTC (permalink / raw)
  To: Stefan Richter; +Cc: Tino Keitel, linux1394-user, linux-scsi, Tejun Heo

On Sun, 18 Oct 2009, Stefan Richter wrote:

> > It seems to me that the restart always fails if the "rediscovered
> > device fw1" or "firewire_sbp2: fw1.0: reconnected to LUN 0000"
> > message comes after the "[sdb] Starting disk" message.  That would
> > sound like an actual bug to me.
> 
> It is not a bug.  IEEE 1394 rediscovery and SBP-2 reconnect can become
> necessary anytime (and they do become necessary at /least/ once during
> PM resume), in no particular order with respect to SCSI request
> submission.  Our drivers (firewire-sbp2 mainly) need to be able to
> handle any order of such events.

Is it possible to delay returning from the device resume routine until
the rediscovery/reconnect has completed?  This is more or less how the
USB stack works.

> Interesting findings.
> 
> There are two independent places of the code that could possibly be
> improved to fix this issue:
> 
> 1.)  sd's PM resume method:
> 
> 1.a)  sd_resume could gain this retry loop which you implemented.

This wouldn't be necessary if the transport was working before 
sd_resume got called.

> 1.b)  sd_resume (but probably not sd_suspend) could optimistically
> ignore any error return from sd_start_stop_device.  If the motor cannot
> be started immediately at resume, the SCSI core would try to start it
> later on when the disk is normally accessed.

This is probably a worthwhile idea in any case.

> My assumption here is that an error return from sd_resume causes the
> disk to become inaccessible (taken offline?).

No.  All it does is cause an error message to be printed in the system 
log.  But it's possible that a failure lower down in the SCSI stack has 
this effect.

Alan Stern



* Re: disk restart failure after suspend
  2009-10-19 13:42                     ` Alan Stern
@ 2009-10-19 18:05                       ` Tino Keitel
  2009-10-19 20:24                       ` Stefan Richter
  1 sibling, 0 replies; 6+ messages in thread
From: Tino Keitel @ 2009-10-19 18:05 UTC (permalink / raw)
  To: Alan Stern; +Cc: Tejun Heo, Stefan Richter, linux1394-user, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 81 bytes --]

Hi,

for reference, I attach the patch I used to work around this.

Regards,
Tino

[-- Attachment #2: retry_restart_disk.diff --]
[-- Type: text/x-diff, Size: 789 bytes --]

--- sd.c	2009-10-19 14:19:20.199156476 +0200
+++ linux-2.6.31/drivers/scsi/sd.c	2009-10-14 22:38:27.922974268 +0200
@@ -2183,6 +2185,7 @@
 	struct scsi_sense_hdr sshdr;
 	struct scsi_device *sdp = sdkp->device;
 	int res;
+	int i = 0;
 
 	if (start)
 		cmd[4] |= 1;	/* START */
@@ -2193,8 +2196,13 @@
 	if (!scsi_device_online(sdp))
 		return -ENODEV;
 
-	res = scsi_execute_req(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
-			       SD_TIMEOUT, SD_MAX_RETRIES, NULL);
+	while ((res = scsi_execute_req(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
+			       SD_TIMEOUT, SD_MAX_RETRIES, NULL)) && i < 10) {
+		i++;
+		sd_printk(KERN_WARNING, sdkp, "START_STOP FAILED, retrying.\n");
+		msleep(2000);
+	}
+
 	if (res) {
 		sd_printk(KERN_WARNING, sdkp, "START_STOP FAILED\n");
 		sd_print_result(sdkp, res);
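
A bounded retry loop of this shape can be tried out in userspace.  The
sketch below is only an illustration: try_start() and struct fake_disk
are hypothetical stand-ins for the START STOP UNIT request, and the sleep
between attempts is left out.  It highlights one C detail: `=` binds more
loosely than `&&`, so the assignment needs its own parentheses if res is
to hold the real return value rather than the 0-or-1 result of the `&&`
expression.

```c
#include <assert.h>

/* Hypothetical stand-in for the disk: try_start() fails with error
 * code 7 until it has been called succeed_on times. */
struct fake_disk {
	int calls;
	int succeed_on;
};

static int try_start(struct fake_disk *d)
{
	d->calls++;
	return d->calls >= d->succeed_on ? 0 : 7;
}

/* Retry up to max_retries times after the first failure and return the
 * result of the last attempt. */
static int start_with_retries(struct fake_disk *d, int max_retries)
{
	int res;
	int i = 0;

	assert(max_retries >= 0);
	/* Parenthesized assignment: res holds try_start()'s own value. */
	while ((res = try_start(d)) && i < max_retries) {
		i++;
		/* a driver would sleep here, e.g. msleep(2000) */
	}
	return res;
}

/* How the compiler groups the loop condition when the inner parentheses
 * are left out: res receives the truth value of the whole && expression,
 * so it is always 0 once the loop ends. */
static int start_with_retries_misgrouped(struct fake_disk *d, int max_retries)
{
	int res;
	int i = 0;

	while ((res = (try_start(d) && i < max_retries)))
		i++;
	return res;
}
```

For a disk that never starts, start_with_retries() returns the error code
after max_retries + 1 attempts, whereas the misgrouped variant returns 0
and so looks like a success to the caller.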



* Re: disk restart failure after suspend
  2009-10-19 13:42                     ` Alan Stern
  2009-10-19 18:05                       ` Tino Keitel
@ 2009-10-19 20:24                       ` Stefan Richter
  2009-10-19 20:28                         ` Tino Keitel
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Richter @ 2009-10-19 20:24 UTC (permalink / raw)
  To: Alan Stern; +Cc: Tino Keitel, linux1394-user, linux-scsi, Tejun Heo

Alan Stern wrote:
> On Sun, 18 Oct 2009, Stefan Richter wrote:
>> IEEE 1394 rediscovery and SBP-2 reconnect can become
>> necessary anytime (and they do become necessary at /least/ once during
>> PM resume), in no particular order with respect to SCSI request
>> submission.  Our drivers (firewire-sbp2 mainly) need to be able to
>> handle any order of such events.
> 
> Is it possible to delay returning from the device resume routine until
> the rediscovery/reconnect has completed?  This is more or less how the
> USB stack works.

Hmm.  FireWire isn't deterministic in this regard; it's partly bus,
partly network.  The transport protocol SBP-2 is kind of a network
protocol with remote DMA.  Rediscovery and reconnect at PM resume are
rather stochastic processes
  - if the target went through a low power state too,
  - if other nodes besides the Linux SBP-2 initiator and the SBP-2
    target are on the bus,
  - not to mention if those other nodes went through a low power
    cycle as well.

I could add .suspend and .resume methods to firewire-sbp2's struct
device_driver (or just .resume if the PM core accepts that... I have to
check the API), and the .resume method could contain a
wait_for_completion_timeout which is unblocked when a reconnect
happened.  However, this could still go wrong if for some reason (e.g.
see above) multiple reconnects to the target happen in a row.

So I tend to think firewire-sbp2 should learn to resubmit requests that
were queued by SCSI midlayer after the SBP-2 connection broke & before
reconnect happened, i.e. hide all this from SCSI midlayer rather than
quitting this request with DID_BUS_BUSY.

>> There are two independent places of the code that could possibly be
>> improved to fix this issue:
>>
>> 1.)  sd's PM resume method:
>>
>> 1.a)  sd_resume could gain this retry loop which you implemented.
> 
> This wouldn't be necessary if the transport was working before 
> sd_resume got called.

Technically the transport does "work" at this time:  It might have
blocked the Scsi_Host though, or it might return "bus busy" status for
one request and then block the host.  But apparently that's not liked by
upper layers during resume.

Anyway, I'd say it this way:

This wouldn't be necessary if the transport just hid this reconnection
phase from SCSI core and everything above it.

Then we only need to rely on the reconnect (or possibly series of
reconnects, see above) to finish before timeout, minus time for the
actual execution of the request.  That should fit comfortably into the
30 seconds SD_TIMEOUT.
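
The budget argument can be written down as a small check.  This is a
sketch only: the 30-second figure is the SD_TIMEOUT value quoted above,
while the function names and any reconnect or execution durations fed
into them are assumptions made for illustration, not measurements.

```c
#include <assert.h>
#include <stdbool.h>

#define TIMEOUT_MS 30000L	/* 30 s, the SD_TIMEOUT figure quoted above */

/* Time left for reconnects once the command's own execution time has
 * been reserved. */
static long reconnect_budget_ms(long timeout_ms, long exec_ms)
{
	assert(exec_ms >= 0 && exec_ms <= timeout_ms);
	return timeout_ms - exec_ms;
}

/* True if n_reconnects reconnects of reconnect_ms each still leave
 * enough room for the command to execute before the timeout. */
static bool fits_in_timeout(long timeout_ms, long exec_ms,
			    int n_reconnects, long reconnect_ms)
{
	assert(n_reconnects >= 0 && reconnect_ms >= 0);
	return (long)n_reconnects * reconnect_ms <=
	       reconnect_budget_ms(timeout_ms, exec_ms);
}
```

For example, with an assumed 5 s spin-up, fits_in_timeout(TIMEOUT_MS,
5000, 3, 2000) holds because 6 s of reconnects fit into the remaining
25 s, while 13 reconnects of 2 s each would not.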

>> 1.b)  sd_resume (but probably not sd_suspend) could optimistically
>> ignore any error return from sd_start_stop_device.  If the motor cannot
>> be started immediately at resume, the SCSI core would try to start it
>> later on when the disk is normally accessed.
> 
> This is probably a worthwhile idea in any case.
> 
>> My assumption here is that an error return from sd_resume causes the
>> disk to become inaccessible (taken offline?).
> 
> No.  All it does is cause an error message to be printed in the system 
> log.  But it's possible that a failure lower down in the SCSI stack has 
> this effect.

I wonder what this might be.
-- 
Stefan Richter
-=====-==--= =-=- =--==
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: disk restart failure after suspend
  2009-10-19 20:24                       ` Stefan Richter
@ 2009-10-19 20:28                         ` Tino Keitel
  2009-10-19 21:50                           ` Stefan Richter
  0 siblings, 1 reply; 6+ messages in thread
From: Tino Keitel @ 2009-10-19 20:28 UTC (permalink / raw)
  To: Stefan Richter; +Cc: Alan Stern, linux1394-user, linux-scsi, Tejun Heo

On Mon, Oct 19, 2009 at 22:24:43 +0200, Stefan Richter wrote:

[...]

> > No.  All it does is cause an error message to be printed in the system 
> > log.  But it's possible that a failure lower down in the SCSI stack has 
> > this effect.
> 
> I wonder what this might be.

In my case, XFS threw a lot of errors because of an inaccessible device
and set the filesystem offline.

Regards,
Tino


* Re: disk restart failure after suspend
  2009-10-19 20:28                         ` Tino Keitel
@ 2009-10-19 21:50                           ` Stefan Richter
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Richter @ 2009-10-19 21:50 UTC (permalink / raw)
  To: Tino Keitel; +Cc: Alan Stern, linux1394-user, linux-scsi, Tejun Heo

Tino Keitel wrote:
> On Mon, Oct 19, 2009 at 22:24:43 +0200, Stefan Richter wrote:
> 
> [...]
> 
>>> No.  All it does is cause an error message to be printed in the system 
>>> log.  But it's possible that a failure lower down in the SCSI stack has 
>>> this effect.
>> I wonder what this might be.
> 
> In my case, XFS threw a lot of errors because of an inaccessible device
> and set the filesystem offline.

Perhaps this happens:

  - START STOP UNIT does not get through to the disk since
    firewire-sbp2 could not reconnect early enough.
  - The kernel _does not_ actually care, contrary to what I thought.
    It logs a device resume failure like Alan mentioned, then
    goes about its business.
  - Linux attempts to use the filesystem and issues read/write/
    whatever requests.
  - Spindle motor is still off, disk returns errors.

At this moment, the target should fail these requests with a specific
sense code which tells the SCSI core that another START STOP UNIT is
required now.  However, it is just as likely that the target firmware
sends something nonsensical or nothing at all.  Or maybe the SCSI core
even proceeds to request START STOP UNIT but the firmware has already
gone belly-up.  Finally, the SCSI core gives up and takes the disk offline.
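
For reference, the sense data that asks for a start command is sense key
NOT READY (2h) with ASC/ASCQ 04h/02h, "LOGICAL UNIT NOT READY,
INITIALIZING COMMAND REQUIRED".  A minimal userspace check for it in
fixed-format sense data might look as follows.  The function name is
made up for this sketch, descriptor-format sense data is not handled, and
what this particular enclosure actually returns is not known from the
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_NOT_READY 0x02

/* True if current, fixed-format sense data reports NOT READY with
 * ASC/ASCQ 04h/02h (initializing command required). */
static bool needs_start_unit(const uint8_t *sense, size_t len)
{
	assert(sense != NULL);
	if (len < 14)
		return false;	/* too short to carry ASC and ASCQ */
	if ((sense[0] & 0x7f) != 0x70)
		return false;	/* not current fixed-format sense data */
	return (sense[2] & 0x0f) == SENSE_KEY_NOT_READY &&
	       sense[12] == 0x04 &&	/* ASC: logical unit not ready */
	       sense[13] == 0x02;	/* ASCQ: initializing command required */
}
```

A buffer with byte 0 = 70h, byte 2 = 02h, byte 12 = 04h and byte 13 = 02h
matches; ASCQ 01h ("in process of becoming ready") does not, since no
start command is needed in that case.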
-- 
Stefan Richter
-=====-==--= =-=- =--==
http://arcgraph.de/sr/

