Suspending SCSI devices and buses

public inbox for linux-scsi@vger.kernel.org

* Suspending SCSI devices and buses
@ 2004-08-18 20:36 Alan Stern
  2004-08-18 20:42 ` Christoph Hellwig
  2004-08-18 20:49 ` Nathan Bryant
  0 siblings, 2 replies; 15+ messages in thread
From: Alan Stern @ 2004-08-18 20:36 UTC (permalink / raw)
  To: SCSI development list

How does a host driver tell the SCSI core that an adapter is being 
suspended (and hence the core needs to suspend all the devices attached to 
that adapter)?  Ditto for resume.

There doesn't appear to be any way to do it.  The scsi_bus_type and 
various driver structures don't contain a "suspend" entry.

Alan Stern

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Suspending SCSI devices and buses
  2004-08-18 20:36 Alan Stern
@ 2004-08-18 20:42 ` Christoph Hellwig
  2004-08-18 20:49 ` Nathan Bryant
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2004-08-18 20:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

On Wed, Aug 18, 2004 at 04:36:53PM -0400, Alan Stern wrote:
> How does a host driver tell the SCSI core that an adapter is being 
> suspended (and hence the core needs to suspend all the devices attached to 
> that adapter)?  Ditto for resume.
> 
> There doesn't appear to be any way to do it.  The scsi_bus_type and 
> various driver structures don't contain a "suspend" entry.

Please read through the "[PATCH] SCSI midlayer power management" thread on
lkml and linux-scsi from last week.


* Re: Suspending SCSI devices and buses
  2004-08-18 20:36 Alan Stern
  2004-08-18 20:42 ` Christoph Hellwig
@ 2004-08-18 20:49 ` Nathan Bryant
  2004-08-19 21:05   ` Alan Stern
  1 sibling, 1 reply; 15+ messages in thread
From: Nathan Bryant @ 2004-08-18 20:49 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

Alan Stern wrote:
> How does a host driver tell the SCSI core that an adapter is being 
> suspended (and hence the core needs to suspend all the devices attached to 
> that adapter)?  Ditto for resume.

For devices, this should be handled by the applicable scsi_driver - 
sd.c, etc. I posted a patch for this to the mailing list a week or two 
ago; check the list archives.

> 
> There doesn't appear to be any way to do it.  The scsi_bus_type and 
> various driver structures don't contain a "suspend" entry.

Correct, the midlayer doesn't quite support this yet, but my patch 
should do the most important bits if somebody would be kind enough to 
merge it... As a quick summary, first we need to quiesce the child 
devices and synchronize their caches, then the LLD just needs to handle 
flushing any remaining in-flight DMA transactions.

I also have an example patch that does this for the aic7xxx LLD, which 
you can also grab from the list archives.
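
The ordering summarized above (quiesce the children, synchronize their caches, then let the LLD deal with in-flight DMA) can be sketched as a small model. This is only an illustration, not the patch itself: the Device and Host classes, their method names, and suspend_host() are hypothetical stand-ins, and the per-device interleaving of quiesce and cache sync is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for one child device (e.g. a disk driven by sd)."""
    name: str
    log: list

    def quiesce(self):
        # Stop accepting new I/O for this device.
        self.log.append(f"{self.name}: quiesce")

    def sync_cache(self):
        # Ask the device to flush its write cache.
        self.log.append(f"{self.name}: sync cache")


@dataclass
class Host:
    """Hypothetical stand-in for the adapter and its low-level driver (LLD)."""
    devices: list
    log: list

    def drain_dma(self):
        # The LLD's remaining job: flush in-flight DMA transactions.
        self.log.append("host: drain in-flight DMA")


def suspend_host(host):
    # Children first: quiesce each device, then synchronize its cache.
    for dev in host.devices:
        dev.quiesce()
        dev.sync_cache()
    # Only after every child is quiet does the adapter itself stop.
    host.drain_dma()
    return host.log
```

The point of the model is the ordering constraint only: the host step is always the last entry in the log.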

Nathan

> 
> Alan Stern
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

* Re: Suspending SCSI devices and buses
  2004-08-18 20:49 ` Nathan Bryant
@ 2004-08-19 21:05   ` Alan Stern
  2004-08-19 21:20     ` Nathan Bryant
  2004-08-20 12:30     ` Nathan Bryant
  0 siblings, 2 replies; 15+ messages in thread
From: Alan Stern @ 2004-08-19 21:05 UTC (permalink / raw)
  To: Nathan Bryant; +Cc: SCSI development list

On Wed, 18 Aug 2004, Nathan Bryant wrote:

> Alan Stern wrote:
> > How does a host driver tell the SCSI core that an adapter is being 
> > suspended (and hence the core needs to suspend all the devices attached to 
> > that adapter)?  Ditto for resume.
> 
> For devices, this should be handled by the applicable scsi_driver - 
> sd.c, etc. I posted a patch for this to the mailing list a week or two 
> ago, check the list archives.
> 
> > 
> > There doesn't appear to be any way to do it.  The scsi_bus_type and 
> > various driver structures don't contain a "suspend" entry.
> 
> Correct, the midlayer doesn't quite support this yet, but my patch 
> should do the most important bits if somebody would be kind enough to 
> merge it... As a quick summary, first we need to quiesce the child 
> devices and synchronize their caches, then the LLD just needs to handle 
> flushing any remaining in-flight DMA transactions.

Thanks.  Looking at your patch, I have a question.  It doesn't look like
the resume path is careful to check for Unit Attention with Power On or
Medium May Have Changed sense.  What happens if somebody changes the
medium while the drive is suspended?  Or am I missing something?

Alan Stern


* Re: Suspending SCSI devices and buses
  2004-08-19 21:05   ` Alan Stern
@ 2004-08-19 21:20     ` Nathan Bryant
  2004-08-20 12:30     ` Nathan Bryant
  1 sibling, 0 replies; 15+ messages in thread
From: Nathan Bryant @ 2004-08-19 21:20 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

Alan Stern wrote:

> Thanks.  Looking at your patch, I have a question.  It doesn't look like
> the resume path is careful to check for Unit Attention with Power On or
> Medium May Have Changed sense.  What happens if somebody changes the
> medium while the drive is suspended?  Or am I missing something?

I just do the same checks that we do on boot. I suppose if somebody does 
this with a mounted filesystem on removable media the results might be 
rather comical...

Does "Medium May Have Changed" show up reliably if somebody changes the 
medium in a cartridge drive, Zip drive, etc while it is powered down? 
Doesn't the kernel already notice that sense if it happens during normal 
operation?

Nathan

> 
> Alan Stern
> 
> 

* Re: Suspending SCSI devices and buses
  2004-08-19 21:05   ` Alan Stern
  2004-08-19 21:20     ` Nathan Bryant
@ 2004-08-20 12:30     ` Nathan Bryant
  2004-08-20 13:35       ` Luben Tuikov
  2004-08-20 15:08       ` Alan Stern
  1 sibling, 2 replies; 15+ messages in thread
From: Nathan Bryant @ 2004-08-20 12:30 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

Alan Stern wrote:

>Thanks.  Looking at your patch, I have a question.  It doesn't look like
>the resume path is careful to check for Unit Attention with Power On or
>Medium May Have Changed sense.  What happens if somebody changes the
>medium while the drive is suspended?  Or am I missing something?
>
You're not missing anything. :(

Thanks for the feedback, looks like you've found a real problem with the 
patch: that is, due to the unconditional spinup call on resume, we clear 
any UNIT ATTENTION state before any of the upper layers ever see it, so 
nobody will notice a possible media change.

Unfortunately, I think that the current media change detection code in 
the Linux kernel can not distinguish power-on events from media change 
events. I'm not sure doing so is even possible for SCSI devices. 
(Comments on that?) Proposed solutions:

Approach #1:
* Continue to do the unconditional spinup, but only for devices that are 
already mounted.
This may miss some media change events, but if we really can't 
distinguish power-on from media change, maybe that's somebody else's 
problem if the device was already mounted. (Changing mounted media is 
the user's fault.)

(Hmm, what about devices that are opened for read/write but not mounted?)

Approach #2:
* Test for UNIT_ATTENTION before spinning up and report this as a media 
change.
Safer, but may report "false positive" media change events if the device 
was only powered down/up.
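
A toy model of the two approaches may make the trade-off easier to see. Everything here is an assumption made for illustration: FakeDisk, spin_up() and the two resume functions are hypothetical names, the pending Unit Attention is reduced to a single flag, and Approach #1 is assumed to leave unmounted devices alone until they are next opened.

```python
UNIT_ATTENTION = 0x6
NO_SENSE = 0x0


class FakeDisk:
    """Hypothetical disk with at most one pending Unit Attention."""

    def __init__(self, pending_ua, mounted):
        self.pending_ua = pending_ua
        self.mounted = mounted
        self.spinning = False

    def test_unit_ready(self):
        # Reporting a pending Unit Attention also clears it.
        if self.pending_ua:
            self.pending_ua = False
            return UNIT_ATTENTION
        return NO_SENSE

    def start_unit(self):
        self.spinning = True


def spin_up(disk):
    # TEST UNIT READY first, then START UNIT; return what TUR reported.
    sense_key = disk.test_unit_ready()
    disk.start_unit()
    return sense_key


def resume_approach_1(disk):
    """Spin up mounted devices only; never report a media change."""
    if disk.mounted:
        spin_up(disk)  # the Unit Attention, if any, is silently consumed
    return False


def resume_approach_2(disk):
    """Spin up and report any Unit Attention as a media change."""
    return spin_up(disk) == UNIT_ATTENTION
```

In this model Approach #1 loses the event for mounted devices and keeps it pending for unmounted ones, while Approach #2 reports it every time, including after a plain power cycle.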

Nathan

* Re: Suspending SCSI devices and buses
  2004-08-20 12:30     ` Nathan Bryant
@ 2004-08-20 13:35       ` Luben Tuikov
  2004-08-20 14:33         ` Nathan Bryant
  2004-08-20 15:08       ` Alan Stern
  1 sibling, 1 reply; 15+ messages in thread
From: Luben Tuikov @ 2004-08-20 13:35 UTC (permalink / raw)
  To: Nathan Bryant; +Cc: Alan Stern, SCSI development list

Nathan Bryant wrote:
> Alan Stern wrote:
> 
>  >Thanks.  Looking at your patch, I have a question.  It doesn't look like
>  >the resume path is careful to check for Unit Attention with Power On or
>  >Medium May Have Changed sense.  What happens if somebody changes the
>  >medium while the drive is suspended?  Or am I missing something?
>  >
> You're not missing anything. :(
> 
> Thanks for the feedback, looks like you've found a real problem with the
> patch: that is, due to the unconditional spinup call on resume, we clear
> any UNIT ATTENTION state before any of the upper layers ever see it, so
> nobody will notice a possible media change.

If UA exists for the initiator port, commands other than INQUIRY, REPORT
LUNS and REQUEST SENSE get terminated and UA reported (CHECK CONDITION with
sense data).  So spinup (TUR + START STOP UNIT) could report the media
change (on TUR).
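
A minimal sketch of that rule, assuming a simplified device server: dispatch() and the tuple-based queue are hypothetical, and REQUEST SENSE is modelled only as "not terminated" (how it returns sense data is left out).

```python
# Opcodes and status codes as defined by the SCSI standards.
TEST_UNIT_READY = 0x00
REQUEST_SENSE = 0x03
INQUIRY = 0x12
START_STOP_UNIT = 0x1B
REPORT_LUNS = 0xA0

GOOD = 0x00
CHECK_CONDITION = 0x02

# Commands that are not terminated by a pending Unit Attention.
UA_EXEMPT = {INQUIRY, REPORT_LUNS, REQUEST_SENSE}


def dispatch(opcode, pending_ua):
    """Return (status, sense, remaining_ua) for one command.

    pending_ua is a list of (sense_key, asc, ascq) tuples queued for
    this initiator, oldest first.
    """
    if pending_ua and opcode not in UA_EXEMPT:
        # Terminate the command and report (and clear) the oldest UA.
        return CHECK_CONDITION, pending_ua[0], pending_ua[1:]
    return GOOD, None, pending_ua
```

With one queued (0x6, 0x28, 0x00) condition, a TEST UNIT READY followed by START STOP UNIT gets CHECK CONDITION on the first command and GOOD on the second, so the TUR is where a media change would be visible.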

> 
> Unfortunately, I think that the current media change detection code in
> the Linux kernel can not distinguish power-on events from media change
> events. I'm not sure doing so is even possible for SCSI devices.

It would really depend on the device (if it kept state over power failures).
But it is true that the code should be able to handle it.  (Also UAs could
be queued by the device server--the code could use this.)

> (Comments on that?) Proposed solutions:
> 
> Approach #1:
> * Continue to do the unconditional spinup, but only for devices that are
> already mounted.
> This may miss some media change events, but if we really can't
> distinguish power-on from media change, maybe that's somebody else's
> problem if the device was already mounted. (Changing mounted media is
> the user's fault.)
> 
> (Hmm, what about devices that are opened for read/write but not mounted?)
> 
> Approach #2:
> * Test for UNIT_ATTENTION before spinning up and report this as a media
> change.
> Safer, but may report "false positive" media change events if the device
> was only powered down/up.

UA can tell you when "removable medium may have changed" (SAM3r13, 5.9.7, b)
and "LU inventory has been changed" (same, g).

But it is true -- I can imagine media being changed without the device
knowing it (old enclosure + power off + screw-driver :-) ).

		Luben



* Re: Suspending SCSI devices and buses
  2004-08-20 13:35       ` Luben Tuikov
@ 2004-08-20 14:33         ` Nathan Bryant
  0 siblings, 0 replies; 15+ messages in thread
From: Nathan Bryant @ 2004-08-20 14:33 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Alan Stern, SCSI development list

Luben Tuikov wrote:
> Nathan Bryant wrote:
> 
>> Alan Stern wrote:
>>
>>  >Thanks.  Looking at your patch, I have a question.  It doesn't look like
>>  >the resume path is careful to check for Unit Attention with Power On or
>>  >Medium May Have Changed sense.  What happens if somebody changes the
>>  >medium while the drive is suspended?  Or am I missing something?
>>  >
>> You're not missing anything. :(
>>
>> Thanks for the feedback, looks like you've found a real problem with the
>> patch: that is, due to the unconditional spinup call on resume, we clear
>> any UNIT ATTENTION state before any of the upper layers ever see it, so
>> nobody will notice a possible media change.
> 
> 
> If UA exists for the initiator port, commands other than INQUIRY, REPORT
> LUNS and REQUEST SENSE get terminated and UA reported (CHECK CONDITION with
> sense data).  So spinup (TUR + START STOP UNIT) could report the media
> change (on TUR).
> 
>>
>> Unfortunately, I think that the current media change detection code in
>> the Linux kernel can not distinguish power-on events from media change
>> events. I'm not sure doing so is even possible for SCSI devices.
> 
> 
> It would really depend on the device (if it kept state over power failures).
> But it is true that the code should be able to handle it.  (Also UAs could
> be queued by the device server--the code could use this.)
> 
>> (Comments on that?) Proposed solutions:
>>
>> Approach #1:
>> * Continue to do the unconditional spinup, but only for devices that are
>> already mounted.
>> This may miss some media change events, but if we really can't
>> distinguish power-on from media change, maybe that's somebody else's
>> problem if the device was already mounted. (Changing mounted media is
>> the user's fault.)
>>
>> (Hmm, what about devices that are opened for read/write but not mounted?)
>>
>> Approach #2:
>> * Test for UNIT_ATTENTION before spinning up and report this as a media
>> change.
>> Safer, but may report "false positive" media change events if the device
>> was only powered down/up.
> 
> 
> UA can tell you when "removable medium may have changed" (SAM3r13, 5.9.7, b)
> and "LU inventory has been changed" (same, g).

The additional sense code for this is 0x28 0x00, or NOT READY TO READY 
TRANSITION (MEDIUM MAY HAVE CHANGED)

The kernel doesn't check for this sense code; currently we interpret all 
UNIT_ATTENTION states as a media change. Might changing to check the 
additional sense code regress things for older devices? I see that the 
0x28 0x00 code is defined in SCSI-2, but I don't know whether it was 
defined in SCSI-1.
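
For reference, checking the additional sense code means looking past the sense key in the sense buffer. The sketch below assumes fixed-format sense data (response code 0x70 or 0x71), where the sense key is the low nibble of byte 2 and the ASC and ASCQ are bytes 12 and 13; the function names are hypothetical and this is not the kernel's code.

```python
UNIT_ATTENTION = 0x6
ASC_MEDIUM_MAY_HAVE_CHANGED = 0x28


def parse_fixed_sense(buf):
    """Return (sense_key, asc, ascq) from fixed-format sense data."""
    if len(buf) < 14:
        raise ValueError("sense buffer too short to hold ASC/ASCQ")
    if (buf[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return buf[2] & 0x0F, buf[12], buf[13]


def is_medium_may_have_changed(buf):
    """True only for UNIT ATTENTION with ASC 0x28, ASCQ 0x00."""
    key, asc, ascq = parse_fixed_sense(buf)
    return (key == UNIT_ATTENTION
            and asc == ASC_MEDIUM_MAY_HAVE_CHANGED
            and ascq == 0x00)
```

A check like this is stricter than treating every UNIT_ATTENTION as a media change, which is exactly why the regression question for older devices matters.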

I suspect that some (perhaps many) devices will see a power up as a 
media load event, for mechanical reasons. Also technically speaking, a 
not-ready-to-ready state transition could be interpreted to occur every 
time we spin the device up, no?

Well, I don't really use removable SCSI devices anymore, but I have an 
old 1Gig Jaz drive that I can play with when I find some time, assuming 
it still works. It used to seem a little flaky, but maybe that was my 
bus. This device also does an auto-spindown, I think, and it would be 
interesting to know what status it reports when it spins back up.

Nathan

> 
> But it is true -- I can imagine media being changed without the device
> knowing it (old enclosure + power off + screw-driver :-) ).
> 
>         Luben
> 
> 


* Re: Suspending SCSI devices and buses
  2004-08-20 12:30     ` Nathan Bryant
  2004-08-20 13:35       ` Luben Tuikov
@ 2004-08-20 15:08       ` Alan Stern
  2004-08-20 15:53         ` Nathan Bryant
  1 sibling, 1 reply; 15+ messages in thread
From: Alan Stern @ 2004-08-20 15:08 UTC (permalink / raw)
  To: Nathan Bryant; +Cc: SCSI development list

On Fri, 20 Aug 2004, Nathan Bryant wrote:

> Thanks for the feedback, looks like you've found a real problem with the 
> patch: that is, due to the unconditional spinup call on resume, we clear 
> any UNIT ATTENTION state before any of the upper layers ever see it, so 
> nobody will notice a possible media change.
> 
> Unfortunately, I think that the current media change detection code in 
> the Linux kernel can not distinguish power-on events from media change 
> events. I'm not sure doing so is even possible for SCSI devices. 
> (Comments on that?) Proposed solutions:

I don't know to what extent the sd driver detects and handles power-on or 
media change events.  I'm also not sure how much point there is in trying 
to distinguish them.  If you can't tell whether the medium was changed 
while the power was off, you should assume that it was for safety's 
sake.  Shucks, the entire drive may have been cold-swapped.

> Approach #1:
> * Continue to do the unconditional spinup, but only for devices that are 
> already mounted.
> This may miss some media change events, but if we really can't 
> distinguish power-on from media change, maybe that's somebody else's 
> problem if the device was already mounted. (Changing mounted media is 
> the user's fault.)

It doesn't matter whose fault it is; if a mounted medium is changed the
driver should be careful to invalidate the existing file references so
that the new medium isn't corrupted and the errors can be propagated back
up to the user.  That's how it seems to me, anyway -- existing kernel 
policy might be different.  How does the floppy driver handle these 
things?

Also bear in mind, while a user may be smart enough not to change a
mounted medium, when the system is suspended it's not so easy to tell
what's mounted.  Mistakes are much more likely to happen during suspend
than during normal operations.

> (Hmm, what about devices that are opened for read/write but not mounted?)

IMHO they should be treated equivalently.  Neither the new medium nor the
I/O stream should be allowed to get corrupted.  Does sd.c even know 
whether or not an open file handle refers to a mounted volume?

> Approach #2:
> * Test for UNIT_ATTENTION before spinning up and report this as a media 
> change.
> Safer, but may report "false positive" media change events if the device 
> was only powered down/up.

Or test _during_ spin-up.  But note that not all Unit Attentions are bad.
Anyway the driver should already be checking for UAs that indicate
power-on or media change (but I don't know whether it actually does so).

For that matter, the driver doesn't seem to care very much about 
PREVENT-ALLOW MEDIUM REMOVAL.  Sure, it issues the PREVENT command when 
the device is opened, but it doesn't recognize that the drive forgets the 
prevent/allow state whenever it is reset.

Also remember, some devices can't be spun up and actively dislike the 
START-STOP UNIT command (to the point of crashing when they receive it).  
I mention this just to make your life more difficult.  :-)

Alan Stern

P.S.: Slightly off-topic for this thread...  Why does sd.c probe for
write-protect only on devices with removable media?  Some fixed-media
devices can also be write-protected.

* Re: Suspending SCSI devices and buses
  2004-08-20 15:08       ` Alan Stern
@ 2004-08-20 15:53         ` Nathan Bryant
  2004-08-20 16:43           ` Alan Stern
  0 siblings, 1 reply; 15+ messages in thread
From: Nathan Bryant @ 2004-08-20 15:53 UTC (permalink / raw)
  To: Alan Stern; +Cc: SCSI development list

Alan Stern wrote:
> I don't know to what extent the sd driver detects and handles power-on or 
> media change events.  I'm also not sure how much point there is in trying 
> to distinguish them.  If you can't tell whether the medium was changed 
> while the power was off, you should assume that it was for safety's 
> sake.  Shucks, the entire drive may have been cold-swapped.

True, but users could cold-swap a non-removable drive too, and I don't 
think we should make the same assumptions there. Users expect resume 
from suspend to be fast, so needlessly invalidating all our buffer cache 
is silly.

Point being that we should probably do something about removable drives, 
but a hardcoded invalidate on resume is overkill.

> 
> 
>>Approach #1:
>>* Continue to do the unconditional spinup, but only for devices that are 
>>already mounted.
>>This may miss some media change events, but if we really can't 
>>distinguish power-on from media change, maybe that's somebody else's 
>>problem if the device was already mounted. (Changing mounted media is 
>>the user's fault.)
> 
> 
> It doesn't matter whose fault it is; if a mounted medium is changed the
> driver should be careful to invalidate the existing file references so
> that the new medium isn't corrupted and the errors can be propagated back
> up to the user.  That's how it seems to me, anyway -- existing kernel 
> policy might be different.  How does the floppy driver handle these 
> things?

Historically, not very well. If you eject a floppy before all dirty 
buffers are flushed, there's not much we can do for you; even if you 
realize your mistake and put the original disk back, you're still 
screwed. Luckily we should flush buffers before suspend, so this may 
mitigate the problem...

Also check out fs/block_dev.c and note the printk "VFS: busy inodes on 
changed media." I don't know that we deal with that condition too well. 
Strangely, I could swear that I've seen this message from the kernel, 
but the comment claims that it's only called during mount/open...

>>(Hmm, what about devices that are opened for read/write but not mounted?)
> 
> 
> IMHO they should be treated equivalently.  Neither the new medium nor the
> I/O stream should be allowed to get corrupted.  Does sd.c even know 
> whether or not an open file handle refers to a mounted volume?

A mounted fs has more complexities than a block device that is only 
opened by userspace. For the latter case we can invalidate all buffers 
and after that the only question is how aggressive to be about passing 
errors up to userspace. For a mounted fs we may have to deal with busy 
inodes and I don't know how well the VFS layer deals with that. So there 
may yet turn out to be a reason to handle them differently.

> Or test _during_ spin-up.  But note that not all Unit Attentions are bad.
> Anyway the driver should already be checking for UAs that indicate
> power-on or media change (but I don't know whether it actually does so).

True. But we currently don't make the distinction between good/bad UA's. 
If we introduce the distinction, we need to be careful about 
regressions. Currently the default is that all UA's are "bad" (trigger a 
media change.) It may make the most sense to keep that as the default 
and only exclude certain additional sense codes that are known to be 
good. In order for those exceptions to be added, someone might have to 
complain about it first, since I'm not sure we can make that change 
without a testcase ;)
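
The "default to bad, exclude known-good codes" policy could look roughly like this. The function name and the contents of the exclusion set are assumptions for illustration only; the single entry shown (ASC 0x2A, ASCQ 0x01, MODE PARAMETERS CHANGED) is a hypothetical example of a candidate, not a code anyone in this thread has proposed or tested.

```python
# (asc, ascq) pairs assumed harmless for media-change purposes.
# Hypothetical contents: real entries would need a reported test case.
KNOWN_BENIGN_UA = {
    (0x2A, 0x01),  # MODE PARAMETERS CHANGED
}


def ua_triggers_media_change(asc, ascq, benign=KNOWN_BENIGN_UA):
    """Every Unit Attention counts as a media change unless excluded."""
    return (asc, ascq) not in benign
```

Starting with an empty set reproduces the current behaviour exactly, so each added entry is an isolated, reviewable change.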

> 
> For that matter, the driver doesn't seem to care very much about 
> PREVENT-ALLOW MEDIUM REMOVAL.  Sure, it issues the PREVENT command when 
> the device is opened, but it doesn't recognize that the drive forgets the 
> prevent/allow state whenever it is reset.
> 
> Also remember, some devices can't be spun up and actively dislike the 
> START-STOP UNIT command (to the point of crashing when they receive it).  
> I mention this just to make your life more difficult.  :-)

I'm aware of that. That's why I'm using the rescan code, since it knows 
about this issue and attempts to TEST UNIT READY first.

It's certainly annoying since START STOP UNIT is extended to specify 
power-management states in the newer SBC standard. :( Maybe it would 
work to just stick with the power control mode parameters from the SPC 
standard, though...

> 
> Alan Stern
> 
> P.S.: Slightly off-topic for this thread...  Why does sd.c probe for
> write-protect only on devices with removable media?  Some fixed-media
> devices can also be write-protected.
> 
> 

* Re: Suspending SCSI devices and buses
  2004-08-20 15:53         ` Nathan Bryant
@ 2004-08-20 16:43           ` Alan Stern
  0 siblings, 0 replies; 15+ messages in thread
From: Alan Stern @ 2004-08-20 16:43 UTC (permalink / raw)
  To: Nathan Bryant; +Cc: SCSI development list

On Fri, 20 Aug 2004, Nathan Bryant wrote:

> True, but users could cold-swap a non-removable drive too, and I don't 
> think we should make the same assumptions there.

Well, you could assume that if a drive with non-removable media sends
ASC x28 it's not because the medium has changed.  Apart from that, I think 
the assumptions should be pretty much the same.  The real question is 
whether a power change will result in ASC x29 -- some drives might not 
send it at all, others might send x28 instead.

>  Users expect resume 
> from suspend to be fast, so needlessly invalidating all our buffer cache 
> is silly.
> 
> Point being that we should probably do something about removable drives, 
> but a hardcoded invalidate on resume is overkill.

I agree that in the absence of RMB, ASC x28 probably shouldn't invalidate
anything.
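
As a sketch of that rule: RMB is bit 7 of byte 1 of the standard INQUIRY data, so the decision for ASC x28 can be keyed off it. The function names are hypothetical, and what to do for ASC x29 is deliberately left out because it is still open in this discussion.

```python
def is_removable(inquiry):
    """RMB is bit 7 of byte 1 of the standard INQUIRY data."""
    return bool(inquiry[1] & 0x80)


def asc28_should_invalidate(inquiry):
    """Honour ASC 0x28 as a media change only when RMB is set."""
    return is_removable(inquiry)
```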


> > IMHO they should be treated equivalently.  Neither the new medium nor the
> > I/O stream should be allowed to get corrupted.  Does sd.c even know 
> > whether or not an open file handle refers to a mounted volume?
> 
> A mounted fs has more complexities than a block device that is only 
> opened by userspace. For the latter case we can invalidate all buffers 
> and after that the only question is how aggressive to be about passing 
> errors up to userspace. For a mounted fs we may have to deal with busy 
> inodes and I don't know how well the VFS layer deals with that. So there 
> may yet turn out to be a reason to handle them differently.

Well yes, but we have to expect that VFS will learn to handle its own
problems.  At the level of sd.c, I'm not aware that the code can
distinguish between a userspace open and a mount.  If it can't then it has
no choice but to treat them the same.


>  But we currently don't make the distinction between good/bad UA's. 
> If we introduce the distinction, we need to be careful about 
> regressions. Currently the default is that all UA's are "bad" (trigger a 
> media change.) It may make the most sense to keep that as the default 
> and only exclude certain additional sense codes that are known to be 
> good. In order for those exceptions to be added, someone might have to 
> complain about it first, since I'm not sure we can make that change 
> without a testcase ;)

That sounds like a reasonable approach.

Alan Stern


* Re: Suspending SCSI devices and buses
@ 2004-08-20 20:49 Pat LaVarre
  2004-08-20 21:38 ` Nathan Bryant
  0 siblings, 1 reply; 15+ messages in thread
From: Pat LaVarre @ 2004-08-20 20:49 UTC (permalink / raw)
  To: linux-scsi; +Cc: Alan Stern

 > a hardcoded invalidate on resume is overkill.

Theoretically yes.

 > current ... Linux kernel ... not distinguish
 > power-on events from media change events.

One case for leaving Linux pessimistic that way is ...

 > I'm not sure doing so is even possible for SCSI devices.
 > (Comments on that?)

SK ASC x 6 29 UnitAttention Reset is the only notice a host gets of a 
wall-powered drive unplugged from one host and plugged into another, 
when the bus does not distinctly identify the host, as in USB, etc.  
From the drive's point of view, that is not an SK ASC x 6 28 
UnitAttention GoneReady ... yet still from the host's point of view, 
that is a media change.

Same for drives that don't let you remove the media: those drives might 
report no unit attentions at all, again implying that plug-in or resume 
of power is always a potential media change for the host.

 > old 1Gig Jaz drive

Those drives often report x 2 04 02 (else x 2 04 00 meaning x 2 04 
00..FF which includes x 2 04 02) while auto spun down, to help the host 
avoid timing out a read that includes spin up delay.  Their auto spin 
down time is configurable with vendor-specific protocol between resets. 
  Insertion spinup, eject spindown, while spinning up and down are all 
more intricate.

 > Does "Medium May Have Changed" show up reliably
 > if somebody changes the medium in a cartridge drive,
 > Zip drive, etc while it is powered down?

Relevant device design questions include:

1) Do enough hosts consistently limit how stale cache may be?

2) Do enough hosts reliably tolerate x 6 28 GoneReady produced by a 
suspend-resume power cycle without media change?

3) Does the device itself cache part of the disc?

4) Is the disc robust enough to tolerate an auto fetch of its identity 
after each resume or power on?

5) Does swapping the disc detectably change a mechanical state in the 
drive?

6) Was the device expensive enough to include a non-volatile record of 
the identity of the disc present before each suspend or power off?

In design, me, I'd vote to have the device pessimistically queue any UA 
that might be necessary, and let the host sort out which are real, but 
I'm not sure how often people like me lose such votes.

Pat LaVarre

* Re: Suspending SCSI devices and buses
  2004-08-20 20:49 Suspending SCSI devices and buses Pat LaVarre
@ 2004-08-20 21:38 ` Nathan Bryant
  2004-08-20 22:06   ` Pat LaVarre
  0 siblings, 1 reply; 15+ messages in thread
From: Nathan Bryant @ 2004-08-20 21:38 UTC (permalink / raw)
  To: Pat LaVarre; +Cc: linux-scsi, Alan Stern

Pat LaVarre wrote:

> > a hardcoded invalidate on resume is overkill.
>
> Theoretically yes.
>
> > current ... Linux kernel ... not distinguish
> > power-on events from media change events.
>
> One case for leaving Linux pessimistic that way is ...
>
> > I'm not sure doing so is even possible for SCSI devices.
> > (Comments on that?)
>
> SK ASC x 6 29 UnitAttention Reset is the only notice a host gets of a 
> wall-powered drive unplugged from one host and plugged into another, 
> when the bus does not distinctly identify the host, as in USB, etc.  
> From the drive's point of view, that is not an SK ASC x 6 28 
> UnitAttention GoneReady ... yet still from the host's point of view, 
> that is a media change.

We only see this sense when the device is unplugged from the wall in the 
process, right?

> Same for drives that don't let you remove the media: those drives 
> might report no unit attentions at all, again implying that plug-in or 
> resume of power is always a potential media change for the host.
>
> > old 1Gig Jaz drive
>
> Those drives often report x 2 04 02 (else x 2 04 00 meaning x 2 04 
> 00..FF which includes x 2 04 02) while auto spun down, to help the 
> host avoid timing out a read that includes spin up delay.  Their auto 
> spin down time is configurable with vendor-specific protocol between 
> resets.  Insertion spinup, eject spindown, while spinning up and down 
> are all more intricate.

OK, that translates as NOT_READY, "LOGICAL UNIT NOT READY, 
INITIALIZING CMD. REQUIRED", which means they require START UNIT, right?
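
To keep the shorthand straight, here is a small lookup for the sense key / ASC / ASCQ triples mentioned so far. The table holds only those codes, with names as I understand them from the standards' ASC/ASCQ assignments; describe() and wants_start_unit() are hypothetical helper names.

```python
NOT_READY = 0x2
UNIT_ATTENTION = 0x6

# (sense_key, asc, ascq) -> description
SENSE_NAMES = {
    (NOT_READY, 0x04, 0x00):
        "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE",
    (NOT_READY, 0x04, 0x02):
        "LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED",
    (UNIT_ATTENTION, 0x28, 0x00):
        "NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED",
    (UNIT_ATTENTION, 0x29, 0x00):
        "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
}


def describe(key, asc, ascq):
    """Human-readable name for a sense triple, if it is in the table."""
    fallback = f"unknown sense {key:x}/{asc:02x}/{ascq:02x}"
    return SENSE_NAMES.get((key, asc, ascq), fallback)


def wants_start_unit(key, asc, ascq):
    """True for x 2 04 02, the 'initializing command required' case."""
    return (key, asc, ascq) == (NOT_READY, 0x04, 0x02)
```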

>
> > Does "Medium May Have Changed" show up reliably
> > if somebody changes the medium in a cartridge drive,
> > Zip drive, etc while it is powered down?
>
> Relevant device design questions include:
>
> 1) Do enough hosts consistently limit how stale cache may be?
>
> 2) Do enough hosts reliably tolerate x 6 28 GoneReady produced by a 
> suspend-resume power cycle without media change?

According to recent SBC/SPC, drives are not supposed to produce x6 28 
for low-power conditions defined by the SCSI standard. There is a 
separate sense code for that: ILLEGAL REQUEST, ASC=LOW POWER CONDITION ON.

However, for SCSI devices hooked up to an ACPI capable desktop power 
supply, S3 suspend looks just like power-off to them. So in that case 
they're likely to report x6 29 and maybe also x6 28.

> 3) Does the device itself cache part of the disc?

With my patch, I'm sending SYNCHRONIZE CACHE on suspend.

> 4) Is the disc robust enough to tolerate an auto fetch of its identity 
> after each resume or power on?
>
> 5) Does swapping the disc detectably change a mechanical state in the 
> drive?
>
> 6) Was the device expensive enough to include a non-volatile record of 
> the identity of the disc present before each suspend or power off?
>
> In design, me, I'd vote to have the device pessimistically queue any 
> UA that might be necessary, and let the host sort out which are real, 
> but I'm not sure how often people like me lose such votes.

Kill 'em all, and let Ghod sort 'em out. :)

* Re: Suspending SCSI devices and buses
  2004-08-20 21:38 ` Nathan Bryant
@ 2004-08-20 22:06   ` Pat LaVarre
  2004-08-20 22:32     ` Nathan Bryant
  0 siblings, 1 reply; 15+ messages in thread
From: Pat LaVarre @ 2004-08-20 22:06 UTC (permalink / raw)
  To: Nathan Bryant; +Cc: linux-scsi, Alan Stern

Nathan B:

>> > current ... Linux kernel ... not distinguish
>> > power-on events from media change events.
>> ...
>> SK ASC x 6 29 UnitAttention Reset is the only notice a host gets of a 
>> wall-powered drive unplugged from one host and plugged into another, 
>> when the bus does not distinctly identify the host, as in USB, etc.  
>> From the drive's point of view, that is not an SK ASC x 6 28 
>> UnitAttention GoneReady ... yet still from the host's point of view, 
>> that is a media change.
>
> We only see this sense when the device is unplugged from the wall in 
> the process, right?

No, help, sorry.  I mean to say, try this thought experiment:

1) Power both hosts.
2) Insert a disc, power the device, connect to one host.
3) Provoke that first host to clear the x 6 29 and x 6 28.
4) Disconnect the device from the first host, connect to the second.

The second host will see only x 6 29, and indeed catch that only 
because the dis/connect reset the device.

That read-once-then-lost x 6 29 is the only signal the second host 
receives that means "media change" i.e. that some other host may have 
written blocks of the media.

In theory, in the specific case of USB, the wall-powered device could 
watch for the loss of bus power, and report that loss as an x 6 28.  
But that device design choice would create more false positive media 
changes: any host that killed bus power as part of suspend would thus 
provoke x 6 28, just as suspend/resume does now when it power cycles 
bus-powered removable drives.

Also, be it truth or slander, I hear that non-removable drives don't 
reliably produce the x 6 29, so to catch this always the host itself 
has to see dis/connect as a potential media change.
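
A minimal sketch of that pessimistic host-side policy, assuming the 
sense data has already been parsed into a key/ASC/ASCQ triple.  The 
struct and function names here are hypothetical, not anything in the 
kernel; only the numeric codes (key x6, ASC x28 and x29) come from the 
discussion above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sense keys mentioned in this thread. */
enum {
    SK_NOT_READY       = 0x2,
    SK_ILLEGAL_REQUEST = 0x5,
    SK_UNIT_ATTENTION  = 0x6,
};

/* Hypothetical container for an already-parsed sense triple. */
struct sense_triple {
    uint8_t key;
    uint8_t asc;
    uint8_t ascq;
};

/*
 * Pessimistic policy: report a potential media change whenever the host
 * saw a dis/connect, or the device queued UNIT ATTENTION with ASC x28
 * (not ready to ready change) or ASC x29 (power on / reset).
 * A NULL pointer means "no sense data was returned".
 */
bool media_may_have_changed(const struct sense_triple *s, bool saw_reconnect)
{
    if (saw_reconnect)
        return true;            /* do not depend on the drive reporting x 6 29 */
    if (s == NULL || s->key != SK_UNIT_ATTENTION)
        return false;
    return s->asc == 0x28 || s->asc == 0x29;
}
```

Treating the dis/connect itself as a trigger is what covers drives 
that never produce the x 6 29.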

>>  x 2 04 02
> translates as NOT_READY,     "LOGICAL UNIT NOT READY, INITIALIZING 
> CMD. REQUIRED"

Often, yes.

> Which means they require START UNIT, right?

Op x1B, yes.
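
For illustration, a sketch of recognizing x 2 04 02 and building the 
six-byte START STOP UNIT CDB (op x1B) with the START bit set.  The 
helper names are hypothetical, and actually sending the CDB to the 
device is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define START_STOP_UNIT_OPCODE  0x1B
#define START_STOP_UNIT_CDB_LEN 6

/* True for NOT READY / LOGICAL UNIT NOT READY, INITIALIZING CMD. REQUIRED. */
bool needs_start_unit(uint8_t key, uint8_t asc, uint8_t ascq)
{
    return key == 0x2 && asc == 0x04 && ascq == 0x02;
}

/* Fill in a START STOP UNIT CDB that asks the unit to start. */
void build_start_unit_cdb(uint8_t cdb[START_STOP_UNIT_CDB_LEN])
{
    memset(cdb, 0, START_STOP_UNIT_CDB_LEN);
    cdb[0] = START_STOP_UNIT_OPCODE;
    cdb[4] = 0x01;   /* START=1, LOEJ=0, power condition field left at 0 */
}
```

With IMMED (byte 1, bit 0) left clear, the command should not complete 
until the unit has finished starting.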

>> 1) Do enough hosts consistently limit how stale cache may be?
>>
>> 2) Do enough hosts reliably tolerate x 6 28 GoneReady produced by a 
>> suspend-resume power cycle without media change?
>
> According to recent SBC/SPC, drives are not supposed to produce x6 28 
> for low-power conditions defined by the SCSI standard. There is a 
> separate sense code for that: ILLEGAL REQUEST, ASC=LOW POWER CONDITION 
> ON.
>
> However, for SCSI devices hooked up to an ACPI capable desktop power 
> supply, S3 suspend looks just like power-off to them. So in that case 
> they're likely to report x6 29 and maybe also x6 28.

Ah, fun.

> S3

I know this is popular PC jargon, but I don't personally know what it means.

> suspend looks just like power-off

Aye, that holds for any bus-powered device whose min power demand 
exceeds max suspend supply.

>> 3) Does the device itself cache part of the disc?
>
> With my patch, I'm sending SYNCHRONIZE CACHE on suspend.

Me, I'd only rely on op x35 flushing written data blocks.

What significantly may remain cached thru op x35 is metadata, 
especially the read cache of metadata e.g. the real C:H:S of each LBA.
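
For reference, a sketch of the ten-byte SYNCHRONIZE CACHE CDB (op x35) 
that a suspend path might build.  The function name is hypothetical 
and this is not taken from the patch under discussion.  Leaving the 
LBA and block count at zero asks the drive to flush everything from 
LBA 0 to the end of the medium.

```c
#include <stdint.h>
#include <string.h>

#define SYNCHRONIZE_CACHE_10_OPCODE  0x35
#define SYNCHRONIZE_CACHE_10_CDB_LEN 10

/* Fill in a SYNCHRONIZE CACHE(10) CDB covering the whole medium. */
void build_sync_cache_cdb(uint8_t cdb[SYNCHRONIZE_CACHE_10_CDB_LEN])
{
    memset(cdb, 0, SYNCHRONIZE_CACHE_10_CDB_LEN);
    cdb[0] = SYNCHRONIZE_CACHE_10_OPCODE;
    /* bytes 2..5: starting LBA = 0 */
    /* bytes 7..8: number of blocks = 0, meaning "through the last block" */
}
```

This only covers cached written data blocks; it says nothing about 
whatever metadata the drive keeps in its read cache.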

>> 4) Is the disc robust enough to tolerate an auto fetch of its identity 
>> after each resume or power on?
>>
>> 5) Does swapping the disc detectably change a mechanical state in the 
>> drive?
>>
>> 6) Was the device expensive enough to include a non-volatile record 
>> of the identity of the disc present before each suspend or power off?
>>
>> In design, me, I'd vote to have the device pessimistically queue any 
>> UA that might be necessary, and let the host sort out which are real, 
>> but I'm not sure how often people like me lose such votes.
>
> Kill 'em all, and let Ghod sort 'em out. :)

Yep. :)  Meanwhile, we could at least write a wikipedia article to 
argue our case ...

Pat LaVarre


* Re: Suspending SCSI devices and buses
  2004-08-20 22:06   ` Pat LaVarre
@ 2004-08-20 22:32     ` Nathan Bryant
  0 siblings, 0 replies; 15+ messages in thread
From: Nathan Bryant @ 2004-08-20 22:32 UTC (permalink / raw)
  To: Pat LaVarre; +Cc: linux-scsi, Alan Stern

Pat LaVarre wrote:

>> S3
>
>
> I know this is popular PC jargon, but I don't personally know what it means.

ACPI suspend-to-RAM. Power is cut to almost everything except RAM and 
wakeup event circuitry, some PCI slots may be left in D3hot with wake 
events enabled. Motherboard signals the ATX power supply to cut power to 
all the peripheral connectors.

It's very similar to powerdown with the RAM refresh turned on.

Nathan


end of thread, other threads:[~2004-08-20 22:32 UTC | newest]

Thread overview: 15+ messages
2004-08-20 20:49 Suspending SCSI devices and buses Pat LaVarre
2004-08-20 21:38 ` Nathan Bryant
2004-08-20 22:06   ` Pat LaVarre
2004-08-20 22:32     ` Nathan Bryant
  -- strict thread matches above, loose matches on Subject: below --
2004-08-18 20:36 Alan Stern
2004-08-18 20:42 ` Christoph Hellwig
2004-08-18 20:49 ` Nathan Bryant
2004-08-19 21:05   ` Alan Stern
2004-08-19 21:20     ` Nathan Bryant
2004-08-20 12:30     ` Nathan Bryant
2004-08-20 13:35       ` Luben Tuikov
2004-08-20 14:33         ` Nathan Bryant
2004-08-20 15:08       ` Alan Stern
2004-08-20 15:53         ` Nathan Bryant
2004-08-20 16:43           ` Alan Stern
